# momijiame/gokinjo

gokinjo: A feature extraction library based on k-nearest neighbor algorithm in Python
Latest commit 5ab16e3 Feb 8, 2019

# gokinjo

### What is this?

• A feature extraction library for Python based on the k-nearest neighbor (k-NN) algorithm
• k-NN based features were used in the 1st place solution of a Kaggle competition (see References)
• The backend of the k-NN algorithm can be switched
• FYI: "gokinjo" means "neighborhood" in Japanese.

### Prerequisites

• Python 3.6 or later
• setuptools >= 30.0.3.0

### How to install

#### From PyPI

`$ pip install gokinjo`

##### With annoy backend

`$ pip install "gokinjo[annoy]"`

#### From source code

`$ pip install git+https://github.com/momijiame/gokinjo.git`

### Quick start

step 1: generate example data

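```python
import numpy as np

# Two features drawn uniformly from [-0.5, 0.5)
x0 = np.random.rand(500) - 0.5
x1 = np.random.rand(500) - 0.5
X = np.array(list(zip(x0, x1)))
# Label is 1 when both features have the same sign (an XOR-like pattern)
y = np.array([1 if i0 * i1 > 0 else 0 for i0, i1 in X])
```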

step 2: plot the above

```python
from matplotlib import pyplot as plt

plt.scatter(X[:, 0], X[:, 1], c=y)
plt.show()
```

The two classes are clearly not linearly separable.
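The plot can be backed up with a quick numerical check. The sketch below is not part of gokinjo: it assumes scikit-learn is installed, regenerates the same kind of data with a fixed seed (the seed and the names `X_demo`, `y_demo`, and `raw_scores` are chosen here for illustration), and measures how well a purely linear classifier does on the raw inputs.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Same construction as step 1, seeded for reproducibility (the seed is an assumption)
rng = np.random.default_rng(42)
X_demo = rng.random((500, 2)) - 0.5
y_demo = (X_demo[:, 0] * X_demo[:, 1] > 0).astype(int)

# 5-fold cross-validated accuracy of a linear model on the raw features
raw_scores = cross_val_score(LogisticRegression(), X_demo, y_demo, cv=5)
print(raw_scores.mean())
```

Because the labels follow an XOR-like pattern, the mean accuracy should stay close to chance level (around 0.5), which is what "not linearly separable" means in practice.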

step 3: extract k-NN feature with K-Fold

```python
from gokinjo import knn_kfold_extract

X_knn = knn_kfold_extract(X, y)
```

step 4: plot the above

```python
plt.scatter(X_knn[:, 0], X_knn[:, 1], c=y)
plt.show()
```

The extracted features look almost linearly separable.
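For intuition about why this works, here is a minimal sketch of one common formulation of k-NN features: for each sample and each class, the cumulative distance to its 1..k nearest neighbors of that class, computed out-of-fold so that a sample never sees itself. This is an illustration built on scikit-learn, not gokinjo's actual implementation; the function `knn_features_sketch`, its parameters, and its defaults are hypothetical.

```python
import numpy as np
from sklearn.model_selection import KFold
from sklearn.neighbors import NearestNeighbors


def knn_features_sketch(X, y, k=1, n_splits=5, seed=0):
    """Hypothetical helper: out-of-fold per-class k-NN distance features."""
    classes = np.unique(y)
    features = np.zeros((len(X), len(classes) * k))
    folds = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
    for train_idx, test_idx in folds.split(X):
        X_train, y_train = X[train_idx], y[train_idx]
        for ci, c in enumerate(classes):
            # Index only the training samples that belong to class c
            nn = NearestNeighbors(n_neighbors=k).fit(X_train[y_train == c])
            # dist has shape (n_test, k), sorted from nearest to farthest
            dist, _ = nn.kneighbors(X[test_idx])
            # Column j holds the summed distance to the j+1 nearest neighbors
            features[test_idx, ci * k:(ci + 1) * k] = np.cumsum(dist, axis=1)
    return features


# XOR-like data as in step 1, seeded here for reproducibility (an assumption)
rng = np.random.default_rng(0)
X_demo = rng.random((500, 2)) - 0.5
y_demo = (X_demo[:, 0] * X_demo[:, 1] > 0).astype(int)

X_knn_sketch = knn_features_sketch(X_demo, y_demo)
print(X_knn_sketch.shape)  # (500, 2): one distance column per class when k=1

# The classes are 0 and 1, so the column index of the smaller distance is a label guess
agreement = (X_knn_sketch.argmin(axis=1) == y_demo).mean()
print(agreement)
```

With `k=1` and two classes, each sample is mapped to (distance to nearest class-0 point, distance to nearest class-1 point). Samples tend to be closer to their own class, so most of them fall on one side of the diagonal where the two distances are equal, which is a linear boundary in the new feature space.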

### Usage example

• Please see the examples in the GitHub repository.

### How to set up a development environment

```sh
$ pip install -e ".[develop]"
$ pytest
```

### References

• The competition in which k-NN features were used in the 1st place solution
• R implementation
• Another highly respected Python implementation