Merge pull request #33 from tamada/documents
Documents
Showing 17 changed files with 358 additions and 41 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
[submodule "docs/themes/cayman-hugo-theme"] | ||
path = docs/themes/cayman-hugo-theme | ||
url = https://github.com/zwbetz-gh/cayman-hugo-theme.git |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,12 @@ | ||
### https://raw.github.com/github/gitignore/991e760c1c6d50fdda246e0178b9c58b06770b90/community/Golang/Hugo.gitignore | ||
|
||
# Generated files by hugo | ||
/public/ | ||
/resources/_gen/ | ||
|
||
# Executable may be added to repository | ||
hugo.exe | ||
hugo.darwin | ||
hugo.linux | ||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,16 @@ | ||
PWD := ${CURDIR} | ||
DOCKER_IMAGE_NAME := "wwwscv" | ||
CONTAINER_REMOVE_FLAG := "--rm" | ||
BASE_URL := "https://tamada.github.io/scv" | ||
HUGO_THEME := "cayman-hugo-theme" | ||
JOJOMI_VERSION := 0.83.1 | ||
|
||
build: | ||
docker run ${CONTAINER_REMOVE_FLAG} --name ${DOCKER_IMAGE_NAME}_build -v "${PWD}":/src -v ${PWD}/public:/output -e HUGO_THEME=$(HUGO_THEME) -e HUGO_BASEURL=${BASE_URL} jojomi/hugo:${JOJOMI_VERSION} | ||
rm -rf public/favicon*.png public/favicon.ico public/apple-touch-icon.png | ||
|
||
start: | ||
docker run ${CONTAINER_REMOVE_FLAG} -d --name ${DOCKER_IMAGE_NAME} -p 1313:1313 -v "${PWD}":/src -v "$(PWD)"/public:/output -e HUGO_THEME=$(HUGO_THEME) -e HUGO_WATCH="true" -e HUGO_BASEURL=${BASE_URL} jojomi/hugo:${JOJOMI_VERSION} | ||
|
||
stop: | ||
	docker stop ${DOCKER_IMAGE_NAME} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,29 @@ | ||
baseURL = "https://tamada.github.io/scv" | ||
languageCode = "en-us" | ||
defaultContentLanguage = "en" | ||
title = "SCV" | ||
enableEmoji = true | ||
theme = "cayman-hugo-theme" | ||
# googleAnalytics = "UA-62401730-14" | ||
|
||
pygmentsCodefences = true | ||
pygmentsStyle = "pygments" | ||
|
||
[params] | ||
project_name = "scv" | ||
project_logo = "images/scale.png" | ||
project_tagline = "Similarities and distances calculator among vectors" | ||
dateFormat = "2006-01-02" | ||
katex = true | ||
|
||
footer = "[![GitHub](https://img.shields.io/badge/GitHub-tamada/scv-blueviolet.svg?logo=github)](https://github.com/tamada/scv) Made with [Hugo](https://gohugo.io/). Theme by [Cayman](https://github.com/zwbetz-gh/cayman-hugo-theme). Deployed to [GitHub Pages](https://pages.github.com/)." | ||
|
||
[menu] | ||
[[menu.nav]] | ||
name = ":house: Home" | ||
url = "/" | ||
weight = 10 | ||
[[menu.nav]] | ||
name = ":gem: Algorithms" | ||
url = "/algorithms" | ||
weight = 20 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,112 @@ | ||
--- | ||
title: ":house: Home" | ||
--- | ||
|
||
[![build](https://github.com/tamada/scv/actions/workflows/build.yml/badge.svg)](https://github.com/tamada/scv/actions/workflows/build.yml) | ||
[![Coverage Status](https://coveralls.io/repos/github/tamada/scv/badge.svg?branch=setup_ci)](https://coveralls.io/github/tamada/scv?branch=setup_ci) | ||
[![Go Report Card](https://goreportcard.com/badge/github.com/tamada/scv)](https://goreportcard.com/report/github.com/tamada/scv) | ||
[![codebeat badge](https://codebeat.co/badges/5221e6ba-da64-45c1-8b13-f833f678e3b9)](https://codebeat.co/projects/github-com-tamada-scv-main) | ||
|
||
[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg?logo=spdx)](https://github.com/tamada/scv/blob/main/LICENSE) | ||
[![Version](https://img.shields.io/badge/Version-1.0.0-blue.svg)](https://github.com/tamada/scv/releases/tag/v1.0.0) | ||
|
||
[![Docker](https://img.shields.io/badge/Docker-ghcr.io%2Ftamada%2Fscv%3A1.0.0-green?logo=docker)](https://github.com/users/tamada/packages/container/package/scv) | ||
|
||
Similarities and distance Calculator among Vectors. | ||
|
||
## :speaking_head: Description | ||
|
||
There are several algorithms for calculating the similarity of two vectors; however, no existing command handles them through a single interface. | ||
`scv` standardizes the interface for calculating the similarities and distances among vectors. | ||
|
||
|
||
## :runner: Usage | ||
|
||
### :question: CLI help message | ||
|
||
```sh | ||
scv [OPTIONS] <VECTORS...> | ||
OPTIONS | ||
-a, --algorithm <ALGORITHM> specifies the calculating algorithm. This option is mandatory. | ||
The value of this option accepts several values separated with comma. | ||
Available values are: simpson, jaccard, dice, cosine, pearson, | ||
euclidean, manhattan, chebyshev, and levenshtein. | ||
-f, --format <FORMAT> specifies the resultant format. Default is default. | ||
Available values are: default, json, and xml. | ||
-t, --input-type <TYPE> specifies the type of VECTORS. Default is file. | ||
If TYPE is separated with comma, each type shows | ||
the corresponding VECTORS. | ||
Available values are: byte_file, term_file, string, and json. | ||
-h, --help prints this message. | ||
VECTORS | ||
the source of vectors for calculation. | ||
``` | ||
|
||
## :athletic_shoe: Examples | ||
|
||
```sh | ||
$ scv -t string -a simpson distance similarity | ||
simpson(distance, similarity) = 0.5000 | ||
$ scv -t string -a jaccard,dice distance similarity | ||
jaccard(distance, similarity) = 0.3333 | ||
dice(distance, similarity) = 0.5000 | ||
``` | ||
|
||
### :whale: Docker | ||
|
||
[![Docker](https://img.shields.io/badge/Docker-ghcr.io%2Ftamada%2Fscv%3A1.0.0-green?logo=docker)](https://github.com/users/tamada/packages/container/package/scv) | ||
|
||
```sh | ||
docker run -it ghcr.io/tamada/scv:latest gives some strings for comparing | ||
``` | ||
|
||
If `scv` needs to read files, mount the directory that contains them into the container with the `-v` option. | ||
|
||
```sh | ||
docker run -v $PWD:/home/scv -it ghcr.io/tamada/scv:latest -f json testdata/*.json | ||
``` | ||
|
||
#### versions | ||
|
||
- `1.0.0`, `latest` | ||
|
||
## :anchor: Install | ||
|
||
### :beer: Homebrew | ||
|
||
Simply type the following commands. | ||
|
||
``` | ||
brew tap tamada/brew | ||
brew install scv | ||
``` | ||
|
||
### Go lang | ||
|
||
``` | ||
go get github.com/tamada/scv | ||
``` | ||
|
||
### :muscle: Compile yourself | ||
|
||
``` | ||
git clone https://github.com/tamada/scv | ||
cd scv | ||
make | ||
``` | ||
|
||
## :smile: About | ||
|
||
### :man_office_worker: Authors :woman_office_worker: | ||
|
||
* Haruaki Tamada ([tamada](https://github.com/tamada)) | ||
|
||
### :scroll: License | ||
|
||
[Apache 2.0](https://github.com/tamada/scv/blob/main/LICENSE) | ||
|
||
### :jack_o_lantern: Icon | ||
|
||
![Icon](https://raw.githubusercontent.com/tamada/scv/main/docs/static/images/scale.png) | ||
|
||
This image was obtained from [iconscout.com](https://iconscout.com/icon/scale-217). |
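The figures in the Examples section above can be reproduced with a short sketch. It is not the `scv` implementation: it assumes that `-t string` turns each string into a character-frequency vector and that the size of a vector is its number of distinct keys. The function names (`char_vector`, `simpson`, `jaccard`, `dice`) are hypothetical and chosen only for illustration.

```python
from collections import Counter


def char_vector(text):
    """Assumed vectorization: map each character to its number of occurrences."""
    return Counter(text)


def simpson(v, w):
    # Number of shared keys divided by the size of the smaller key set.
    return len(v.keys() & w.keys()) / min(len(v), len(w))


def jaccard(v, w):
    # Number of shared keys divided by the number of keys in either vector.
    return len(v.keys() & w.keys()) / len(v.keys() | w.keys())


def dice(v, w):
    # Twice the number of shared keys divided by the sum of both key counts.
    return 2 * len(v.keys() & w.keys()) / (len(v) + len(w))


v = char_vector("distance")
w = char_vector("similarity")
for name, func in (("simpson", simpson), ("jaccard", jaccard), ("dice", dice)):
    print(f"{name}(distance, similarity) = {func(v, w):.4f}")
# simpson(distance, similarity) = 0.5000
# jaccard(distance, similarity) = 0.3333
# dice(distance, similarity) = 0.5000
```

Under these assumptions, the two strings share four of their eight distinct characters each (`a`, `i`, `s`, `t`), which gives the same values as the CLI output shown in the Examples section.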
This file was deleted.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,60 @@ | ||
--- | ||
title: ":gem: Algorithms" | ||
--- | ||
|
||
## :speaking_head: Overview | ||
|
||
`scv` supports the simpson, jaccard, dice, cosine, pearson, euclidean, manhattan, and chebyshev algorithms. | ||
These algorithms fall into two categories: similarities and distances. | ||
The following sections describe each algorithm and how it calculates the similarity or distance of two vectors. | ||
|
||
In the descriptions of the following algorithms, | ||
we assume that two vectors are given: $$v = \{ (a_1, b_1), (a_2, b_2), ..., (a_n, b_n) \}$$ and $$w = \{ (c_1, d_1), (c_2, d_2), ..., (c_m, d_m) \}$$. | ||
|
||
## Similarities | ||
|
||
These algorithms calculate similarities among the given vectors. | ||
To do so, `scv` computes the similarity of every pair (combination) of two vectors taken from the given vectors. | ||
|
||
In the description of each algorithm, | ||
each element of a vector is a pair of a key and a value. | ||
|
||
### Simpson Index | ||
|
||
$$S = \frac{|\mathrm{intersect}(v, w)|}{\min(|v|, |w|)}$$ | ||
|
||
The $$\mathrm{intersect}$$ function returns a new vector built from the keys common to $$v$$ and $$w$$ (that is, the keys with $$a_i = c_j$$ for some $$1 \leq i \leq n$$ and $$1 \leq j \leq m$$) and the sum of their values. | ||
|
||
|
||
### Jaccard index | ||
|
||
$$J=\frac{|\mathrm{intersect}(v, w)|}{|\mathrm{union}(v, w)|}$$ | ||
|
||
The $$\mathrm{union}$$ function returns a new vector that contains every key of $$v$$ and $$w$$. | ||
|
||
### Dice index | ||
|
||
$$D=\frac{2\times|\mathrm{intersect}(v, w)|}{|v| + |w|}$$ | ||
|
||
### Cosine similarity | ||
|
||
$$C=\cos\theta=\frac{v\cdot w}{\sqrt{\sum_{i=1}^{n}b_i^2}\sqrt{\sum_{j=1}^{m}d_j^2}}$$ | ||
|
||
### Pearson correlation coefficient | ||
|
||
|
||
|
||
## Distances | ||
|
||
### Euclidean Distance | ||
|
||
|
||
|
||
### Manhattan Distance | ||
|
||
|
||
|
||
### Chebyshev Distance | ||
|
||
|
||
### Edit Distance |
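As an illustration of the cosine similarity formula in the algorithms page above, the following sketch represents a key-value vector as a Python mapping. It is not taken from `scv`: it assumes that the dot product $$v \cdot w$$ is taken over the keys present in both vectors, and that a vector with zero norm yields a similarity of 0. The function name `cosine` is hypothetical.

```python
import math
from collections import Counter


def cosine(v, w):
    """Cosine similarity of two key-value vectors given as mappings."""
    # Dot product over the keys that appear in both vectors (assumption).
    dot = sum(value * w[key] for key, value in v.items() if key in w)
    # Euclidean norms, computed from the values of each vector.
    norm_v = math.sqrt(sum(value * value for value in v.values()))
    norm_w = math.sqrt(sum(value * value for value in w.values()))
    if norm_v == 0 or norm_w == 0:
        # Assumption: treat an empty or all-zero vector as dissimilar.
        return 0.0
    return dot / (norm_v * norm_w)


# Character-frequency vectors are used here only as sample input.
v = Counter("distance")
w = Counter("similarity")
print(f"cosine = {cosine(v, w):.4f}")
```

For these sample vectors the shared keys `a`, `i`, `s`, and `t` contribute a dot product of 6, and the norms are $$\sqrt{8}$$ and $$4$$, so the result is roughly 0.53.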
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
{{ define "main" }} | ||
<h1>{{ .Title | emojify }}</h1> | ||
{{ .Content }} | ||
{{ end }} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,18 @@ | ||
{{ define "main" }} | ||
<h1>{{ .Title | emojify }}</h1> | ||
|
||
{{ .Content }} | ||
|
||
<ul> | ||
{{ $pages := where site.RegularPages "Type" "in" site.Params.mainSections }} | ||
{{ range $pages.ByPublishDate.Reverse }} | ||
<li> | ||
{{ $dateFormat := $.Site.Params.dateFormat | default "Jan 2, 2006" }} | ||
{{ .PublishDate.Format $dateFormat }} | ||
<a href="{{ .Permalink }}"> | ||
{{ .Title }} | ||
</a> | ||
</li> | ||
{{ end }} | ||
</ul> | ||
{{ end }} |