Merge pull request #33 from tamada/documents
Documents
tamada authored Jun 30, 2021
2 parents 8b83133 + fed2887 commit bcb515f
Showing 17 changed files with 358 additions and 41 deletions.
3 changes: 3 additions & 0 deletions .gitmodules
@@ -0,0 +1,3 @@
[submodule "docs/themes/cayman-hugo-theme"]
path = docs/themes/cayman-hugo-theme
url = https://github.com/zwbetz-gh/cayman-hugo-theme.git
2 changes: 2 additions & 0 deletions Makefile
@@ -13,8 +13,10 @@ test: setup

define __create_dist
mkdir -p dist/$(1)_$(2)/$(DIST)
rm -rf dist/$(1)_$(2)/$(DIST)/docs
GOOS=$1 GOARCH=$2 go build -o dist/$(1)_$(2)/$(DIST)/$(NAME)$(3) main.go args.go printer.go input.go
cp -r README.md LICENSE completions dist/$(1)_$(2)/$(DIST)
cp -r docs/public dist/$(1)_$(2)/$(DIST)/docs
tar cfz dist/$(DIST)_$(1)_$(2).tar.gz -C dist/$(1)_$(2) $(DIST)
echo "Done $(1)_$(2)"
endef
32 changes: 27 additions & 5 deletions README.md
@@ -8,8 +8,7 @@
[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg?logo=spdx)](https://github.com/tamada/scv/blob/main/LICENSE)
[![Version](https://img.shields.io/badge/Version-1.0.0-blue.svg)](https://github.com/tamada/scv/releases/tag/v1.0.0)

[![Docker](https://img.shields.io/badge/Docker-ghcr.io%2Ftamada%2Fscvt%3A1.0.0-green?logo=docker)](https://github.com/users/tamada/packages/container/package/scv)

[![Docker](https://img.shields.io/badge/Docker-ghcr.io%2Ftamada%2Fscv%3A1.0.0-green?logo=docker)](https://github.com/users/tamada/packages/container/package/scv)

Similarities and distance Calculator among Vectors.

@@ -35,7 +34,7 @@ OPTIONS
-t, --input-type <TYPE> specifies the type of VECTORS. Default is file.
If TYPE is separated with comma, each type shows
the corresponding VECTORS.
Available values are: file, string, and json.
Available values are: byte_file, term_file, string, and json.
-h, --help prints this message.
VECTORS
the source of vectors for calculation.
@@ -53,7 +52,7 @@ dice(distance, similarity) = 0.5000

### :whale: Docker

[![Docker](https://img.shields.io/badge/Docker-ghcr.io%2Ftamada%2Fscvt%3A1.0.0-green?logo=docker)](https://github.com/users/tamada/packages/container/package/scv)
[![Docker](https://img.shields.io/badge/Docker-ghcr.io%2Ftamada%2Fscv%3A1.0.0-green?logo=docker)](https://github.com/users/tamada/packages/container/package/scv)

```sh
docker run -it ghcr.io/tamada/scv:latest gives some strings for comparing
@@ -71,6 +70,29 @@ docker run -v $PWD:/home/scv -it ghcr.io/tamada/scv:latest -f json testdata/*.js

## :anchor: Install

### :beer: Homebrew

Simply type the following commands.

```
brew tap tamada/brew
brew install scv
```

### Go lang

```
go get github.com/tamada/scv
```

### :muscle: Compile yourself

```
git clone https://github.com/tamada/scv
cd scv
make
```

## :smile: About

### :man_office_worker: Authors :woman_office_worker:
@@ -83,6 +105,6 @@ docker run -v $PWD:/home/scv -it ghcr.io/tamada/scv:latest -f json testdata/*.js

### :jack_o_lantern: Icon

![Icon](https://github.com/tamada/scv/blob/main/docs/static/images/scale.png)
![Icon](https://raw.githubusercontent.com/tamada/scv/main/docs/static/images/scale.png)

This image is obtained from [iconscout.com](https://iconscout.com/icon/scale-217).
10 changes: 5 additions & 5 deletions completions/bash/scv.bash
@@ -11,16 +11,16 @@ __scv() {
COMPREPLY=($(compgen -W "simpson jaccard dice cosine pearson euclidean manhattan chebyshev levenshtein" -- "${cur}"))
return 0
;;
--format | -f)
COMPREPLY=($(compgen -W "default xml json" -- "${cur}"))
--input-type | -t)
COMPREPLY=($(compgen -W "byte_file term_file string json" -- "${cur}"))
return 0
;;
--input-type | -t)
COMPREPLY=($(compgen -W "string json byte_file term_file" -- "${cur}"))
--format | -f)
COMPREPLY=($(compgen -W "default json xml" -- "${cur}"))
return 0
;;
esac
opts="-a --algorithm -f --format -t --input-type -h --help"
opts=" -a -f -t -h --algorithm --format --input-type --help"
if [[ "$cur" =~ ^\- ]]; then
COMPREPLY=( $(compgen -W "${opts}" -- "${cur}") )
return 0
12 changes: 12 additions & 0 deletions docs/.gitignore
@@ -0,0 +1,12 @@
### https://raw.github.com/github/gitignore/991e760c1c6d50fdda246e0178b9c58b06770b90/community/Golang/Hugo.gitignore

# Generated files by hugo
/public/
/resources/_gen/

# Executable may be added to repository
hugo.exe
hugo.darwin
hugo.linux


16 changes: 16 additions & 0 deletions docs/Makefile
@@ -0,0 +1,16 @@
PWD := ${CURDIR}
DOCKER_IMAGE_NAME := "wwwscv"
CONTAINER_REMOVE_FLAG := "--rm"
BASE_URL := "https://tamada.github.io/scv"
HUGO_THEME := "cayman-hugo-theme"
JOJOMI_VERSION := 0.83.1

build:
docker run ${CONTAINER_REMOVE_FLAG} --name ${DOCKER_IMAGE_NAME}_build -v "${PWD}":/src -v ${PWD}/public:/output -e HUGO_THEME=$(HUGO_THEME) -e HUGO_BASEURL=${BASE_URL} jojomi/hugo:${JOJOMI_VERSION}
rm -rf public/favicon*.png public/favicon.ico public/apple-touch-icon.png

start:
docker run ${CONTAINER_REMOVE_FLAG} -d --name ${DOCKER_IMAGE_NAME} -p 1313:1313 -v "${PWD}":/src -v "$(PWD)"/public:/output -e HUGO_THEME=$(HUGO_THEME) -e HUGO_WATCH="true" -e HUGO_BASEURL=${BASE_URL} jojomi/hugo:${JOJOMI_VERSION}

stop:
docker stop ${DOCKER_IMAGE_NAME}
29 changes: 29 additions & 0 deletions docs/config.toml
@@ -0,0 +1,29 @@
baseURL = "https://tamada.github.io/scv"
languageCode = "en-us"
defaultContentLanguage = "en"
title = "SCV"
enableEmoji = true
theme = "cayman-hugo-theme"
# googleAnalytics = "UA-62401730-14"

pygmentsCodefences = true
pygmentsStyle = "pygments"

[params]
project_name = "scv"
project_logo = "images/scale.png"
project_tagline = "Similarities and distances calculator among vectors"
dateFormat = "2006-01-02"
katex = true

footer = "[![GitHub](https://img.shields.io/badge/GitHub-tamada/scv-blueviolet.svg?logo=github)](https://github.com/tamada/scv) Made with [Hugo](https://gohugo.io/). Theme by [Cayman](https://github.com/zwbetz-gh/cayman-hugo-theme). Deployed to [GitHub Pages](https://pages.github.com/)."

[menu]
[[menu.nav]]
name = ":house: Home"
url = "/"
weight = 10
[[menu.nav]]
name = ":gem: Algorithms"
url = "/algorithms"
weight = 20
112 changes: 112 additions & 0 deletions docs/content/_index.md
@@ -0,0 +1,112 @@
---
title: ":house: Home"
---

[![build](https://github.com/tamada/scv/actions/workflows/build.yml/badge.svg)](https://github.com/tamada/scv/actions/workflows/build.yml)
[![Coverage Status](https://coveralls.io/repos/github/tamada/scv/badge.svg?branch=setup_ci)](https://coveralls.io/github/tamada/scv?branch=setup_ci)
[![Go Report Card](https://goreportcard.com/badge/github.com/tamada/scv)](https://goreportcard.com/report/github.com/tamada/scv)
[![codebeat badge](https://codebeat.co/badges/5221e6ba-da64-45c1-8b13-f833f678e3b9)](https://codebeat.co/projects/github-com-tamada-scv-main)

[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg?logo=spdx)](https://github.com/tamada/scv/blob/main/LICENSE)
[![Version](https://img.shields.io/badge/Version-1.0.0-blue.svg)](https://github.com/tamada/scv/releases/tag/v1.0.0)

[![Docker](https://img.shields.io/badge/Docker-ghcr.io%2Ftamada%2Fscv%3A1.0.0-green?logo=docker)](https://github.com/users/tamada/packages/container/package/scv)

Similarities and distance Calculator among Vectors.

## :speaking_head: Description

Several algorithms exist for calculating the similarity of two vectors; however, no single command handles them all.
`scv` standardizes the interface for calculating the similarities and distances among vectors.


## :runner: Usage

### :question: CLI help message

```sh
scv [OPTIONS] <VECTORS...>
OPTIONS
-a, --algorithm <ALGORITHM> specifies the calculating algorithm. This option is mandatory.
The value of this option accepts several values separated with comma.
Available values are: simpson, jaccard, dice, cosine, pearson,
euclidean, manhattan, chebyshev, and levenshtein.
-f, --format <FORMAT> specifies the resultant format. Default is default.
Available values are: default, json, and xml.
-t, --input-type <TYPE> specifies the type of VECTORS. Default is file.
If TYPE is separated with comma, each type shows
the corresponding VECTORS.
Available values are: byte_file, term_file, string, and json.
-h, --help prints this message.
VECTORS
the source of vectors for calculation.
```

## :athletic_shoe: Examples

```sh
$ scv -t string -a simpson distance similarity
simpson(distance, similarity) = 0.5000
$ scv -t string -a jaccard,dice distance similarity
jaccard(distance, similarity) = 0.3333
dice(distance, similarity) = 0.5000
```

### :whale: Docker

[![Docker](https://img.shields.io/badge/Docker-ghcr.io%2Ftamada%2Fscv%3A1.0.0-green?logo=docker)](https://github.com/users/tamada/packages/container/package/scv)

```sh
docker run -it ghcr.io/tamada/scv:latest gives some strings for comparing
```

If `scv` needs to read files, mount them into the container with the `-v` option.

```sh
docker run -v $PWD:/home/scv -it ghcr.io/tamada/scv:latest -f json testdata/*.json
```

#### versions

- `1.0.0`, `latest`

## :anchor: Install

### :beer: Homebrew

Simply type the following commands.

```
brew tap tamada/brew
brew install scv
```

### Go lang

```
go get github.com/tamada/scv
```

### :muscle: Compile yourself

```
git clone https://github.com/tamada/scv
cd scv
make
```

## :smile: About

### :man_office_worker: Authors :woman_office_worker:

* Haruaki Tamada ([tamada](https://github.com/tamada))

### :scroll: License

[Apache 2.0](https://github.com/tamada/scv/blob/main/LICENSE)

### :jack_o_lantern: Icon

![Icon](https://raw.githubusercontent.com/tamada/scv/main/docs/static/images/scale.png)

This image is obtained from [iconscout.com](https://iconscout.com/icon/scale-217).
31 changes: 0 additions & 31 deletions docs/content/algorithms.md

This file was deleted.

60 changes: 60 additions & 0 deletions docs/content/algorithms.mmark
@@ -0,0 +1,60 @@
---
title: ":gem: Algorithms"
---

## :speaking_head: Overview

`scv` supports the simpson, jaccard, dice, cosine, pearson, euclidean, manhattan, chebyshev, and levenshtein (edit distance) algorithms.
These algorithms fall into two categories: similarities and distances.
The following sections describe each algorithm and how it calculates a similarity or distance from two vectors.

Throughout, we assume that two vectors are given:
$$v = \{ (a_1, b_1), (a_2, b_2), ..., (a_n, b_n) \}$$ and $$w = \{ (c_1, d_1), (c_2, d_2), ..., (c_m, d_m) \}$$.

## Similarities

`scv` calculates similarities among the given vectors.
To do so, it computes the similarity of every pair (combination) of the given vectors.

In the descriptions below, each element of a vector is a key-value pair:
in $$v$$, $$a_i$$ is the key and $$b_i$$ is the value.

### Simpson Index

$$S = \frac{|\mathrm{intersect}(v, w)|}{\min(|v|, |w|)}$$

The $$\mathrm{intersect}$$ function returns a new vector made of the keys common to both vectors ($$a_i = c_j$$ for some $$1 \leq i \leq n$$, $$1 \leq j \leq m$$), each with the sum of its values.


### Jaccard index

$$J=\frac{|\mathrm{intersect}(v, w)|}{|\mathrm{union}(v, w)|}$$

The $$\mathrm{union}$$ function returns a new vector that contains every key of $$v$$ and $$w$$.

### Dice index

$$D=\frac{2\times|\mathrm{intersect}(v, w)|}{|v| + |w|}$$
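
As a minimal sketch of the three set-based indices above, the Go program below assumes that $$|\cdot|$$ counts distinct keys and that each character of an input string is one key. Both are assumptions of this sketch, not confirmed details of `scv`, and the names `keySet`, `intersectSize`, `Simpson`, `Jaccard`, and `Dice` are hypothetical. Under these assumptions, the results for `distance` and `similarity` agree with the `scv -t string` example output (0.5000, 0.3333, 0.5000).

```go
package main

import "fmt"

// keySet returns the set of distinct characters in s.
// Treating each character as a key is an assumption of this sketch.
func keySet(s string) map[rune]bool {
	set := make(map[rune]bool)
	for _, r := range s {
		set[r] = true
	}
	return set
}

// intersectSize counts the keys that appear in both sets.
func intersectSize(v, w map[rune]bool) int {
	n := 0
	for k := range v {
		if w[k] {
			n++
		}
	}
	return n
}

// Simpson returns |intersect(v, w)| / min(|v|, |w|).
// The result is NaN when both sets are empty.
func Simpson(v, w map[rune]bool) float64 {
	smaller := len(v)
	if len(w) < smaller {
		smaller = len(w)
	}
	return float64(intersectSize(v, w)) / float64(smaller)
}

// Jaccard returns |intersect(v, w)| / |union(v, w)|.
func Jaccard(v, w map[rune]bool) float64 {
	common := intersectSize(v, w)
	union := len(v) + len(w) - common
	return float64(common) / float64(union)
}

// Dice returns 2 * |intersect(v, w)| / (|v| + |w|).
func Dice(v, w map[rune]bool) float64 {
	return 2 * float64(intersectSize(v, w)) / float64(len(v)+len(w))
}

func main() {
	v, w := keySet("distance"), keySet("similarity")
	fmt.Printf("simpson = %.4f\n", Simpson(v, w))
	fmt.Printf("jaccard = %.4f\n", Jaccard(v, w))
	fmt.Printf("dice    = %.4f\n", Dice(v, w))
}
```

Both strings have 8 distinct characters and share 4 of them, so the union has 12 keys; this is where 4/8, 4/12, and 8/16 come from.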

### Cosine similarity

$$C=\cos\theta=\frac{v\cdot w}{\sqrt{\sum_{i=1}^{n}b_i^2}\sqrt{\sum_{j=1}^{m}d_j^2}}$$
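
The cosine formula can be sketched in Go for key-value vectors as follows. The sketch assumes that the dot product $$v\cdot w$$ runs over the keys common to both vectors (a key missing from one vector contributes 0) and that a zero vector yields 0; these choices and the function name `Cosine` are assumptions for illustration, not confirmed behavior of `scv`.

```go
package main

import (
	"fmt"
	"math"
)

// Cosine returns (v . w) / (||v|| * ||w||) for sparse key-value vectors.
// A key that is absent from one vector is treated as value 0 (assumption).
func Cosine(v, w map[string]float64) float64 {
	var dot, normV, normW float64
	for k, b := range v {
		normV += b * b
		if d, ok := w[k]; ok {
			dot += b * d // only common keys contribute to the dot product
		}
	}
	for _, d := range w {
		normW += d * d
	}
	if normV == 0 || normW == 0 {
		return 0 // cosine is undefined for a zero vector; this sketch returns 0
	}
	return dot / (math.Sqrt(normV) * math.Sqrt(normW))
}

func main() {
	v := map[string]float64{"a": 1, "b": 2}
	w := map[string]float64{"b": 2, "c": 1}
	fmt.Printf("cosine = %.4f\n", Cosine(v, w))
}
```

In this example the only common key is `b`, so the dot product is 4 and both squared norms are 5, giving a value of approximately 0.8.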

### Pearson correlation coefficient



## Distances

### Euclidean Distance



### Manhattan Distance



### Chebyshev Distance
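
For orientation, the textbook definitions of the Euclidean, Manhattan, and Chebyshev distances can be sketched in Go as below. This is an illustration under stated assumptions, not the confirmed implementation in `scv`: values are compared key by key over the union of keys, a missing key counts as 0, and the names `diffs`, `Euclidean`, `Manhattan`, and `Chebyshev` are hypothetical.

```go
package main

import (
	"fmt"
	"math"
)

// diffs returns |b - d| for every key in the union of v and w,
// treating a missing key as value 0 (an assumption of this sketch).
func diffs(v, w map[string]float64) []float64 {
	var out []float64
	for k, b := range v {
		out = append(out, math.Abs(b-w[k])) // w[k] is 0 when k is absent
	}
	for k, d := range w {
		if _, ok := v[k]; !ok {
			out = append(out, math.Abs(d))
		}
	}
	return out
}

// Euclidean is the square root of the sum of squared differences.
func Euclidean(v, w map[string]float64) float64 {
	sum := 0.0
	for _, x := range diffs(v, w) {
		sum += x * x
	}
	return math.Sqrt(sum)
}

// Manhattan is the sum of absolute differences.
func Manhattan(v, w map[string]float64) float64 {
	sum := 0.0
	for _, x := range diffs(v, w) {
		sum += x
	}
	return sum
}

// Chebyshev is the largest absolute difference.
func Chebyshev(v, w map[string]float64) float64 {
	max := 0.0
	for _, x := range diffs(v, w) {
		if x > max {
			max = x
		}
	}
	return max
}

func main() {
	v := map[string]float64{"a": 1, "b": 2}
	w := map[string]float64{"b": 5, "c": 4}
	fmt.Printf("euclidean = %.4f\n", Euclidean(v, w))
	fmt.Printf("manhattan = %.4f\n", Manhattan(v, w))
	fmt.Printf("chebyshev = %.4f\n", Chebyshev(v, w))
}
```

With these inputs the per-key differences are 1, 3, and 4, so the three distances are the square root of 26, 8, and 4 respectively.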


### Edit Distance
4 changes: 4 additions & 0 deletions docs/layouts/_default/single.html
@@ -0,0 +1,4 @@
{{ define "main" }}
<h1>{{ .Title | emojify }}</h1>
{{ .Content }}
{{ end }}
18 changes: 18 additions & 0 deletions docs/layouts/index.html
@@ -0,0 +1,18 @@
{{ define "main" }}
<h1>{{ .Title | emojify }}</h1>

{{ .Content }}

<ul>
{{ $pages := where site.RegularPages "Type" "in" site.Params.mainSections }}
{{ range $pages.ByPublishDate.Reverse }}
<li>
{{ $dateFormat := $.Site.Params.dateFormat | default "Jan 2, 2006" }}
{{ .PublishDate.Format $dateFormat }}
<a href="{{ .Permalink }}">
{{ .Title }}
</a>
</li>
{{ end }}
</ul>
{{ end }}