# **README**

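## **Development environment**
This repository is developed with the setup shown below.

This notebook also documents the steps for building the development environment shown in the figure.
(Work in progress)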

![Alt text](https://drive.google.com/uc?id=1Hy35Mdicyxx7Q3riCsfwoh7bHCrE-JJc)

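## **Setup**

# **Initial Colab setup**
`Kaggle` datasets are downloaded from this notebook into `GoogleDrive`.

After the download, do the analysis in separate notebooks (`.ipynb`).

Mount `GoogleDrive` in each of those notebooks.

The steps below cover enabling the `KaggleAPI`, mounting `GoogleDrive`, adding the `Tree` module, downloading a `Kaggle` dataset, and connecting to GitHub up to the first commit.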

## **Installing the `Kaggle API`**

In [0]:
!pip install kaggle

## **Enabling the `Kaggle API`**

[Kaggle API with Colab](https://colab.research.google.com/drive/1eufc8aNCdjHbrBhuy7M7X6BGyzAyRbrF#scrollTo=5l1V_oxXsZ8l&forceEdit=true&sandboxMode=true)

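Before running the cell below, download `kaggle.json` and store it in `GoogleDrive`. Running the cell opens an authentication prompt; once access is granted, `kaggle.json` is looked up in `GoogleDrive` and saved under `/root/.kaggle`.
The original code in the notebook linked above uses

`filename = "/content/.kaggle/kaggle.json"`

but this causes a lookup error when the API starts, so change it to

`filename = "/root/.kaggle/kaggle.json"`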


In [0]:
from googleapiclient.discovery import build
import io, os
from googleapiclient.http import MediaIoBaseDownload
from google.colab import auth

auth.authenticate_user()

drive_service = build('drive', 'v3')
results = drive_service.files().list(
        q="name = 'kaggle.json'", fields="files(id)").execute()
kaggle_api_key = results.get('files', [])

filename = "/root/.kaggle/kaggle.json"
os.makedirs(os.path.dirname(filename), exist_ok=True)

request = drive_service.files().get_media(fileId=kaggle_api_key[0]['id'])
fh = io.FileIO(filename, 'wb')
downloader = MediaIoBaseDownload(fh, request)
done = False
while done is False:
    status, done = downloader.next_chunk()
    print("Download %d%%." % int(status.progress() * 100))
os.chmod(filename, 0o600)  # octal literal: owner read/write only

Download 100%.
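Once the download has finished, it can help to confirm that the file is usable before calling the API. The following is a minimal sketch, not part of the original notebook: the helper name `check_kaggle_credentials` is hypothetical, and it assumes a POSIX file system and that `kaggle.json` carries the `username` and `key` fields. It flags a missing file, permissions wider than owner-only, and empty or malformed JSON.

```python
import json
import os
import stat
import tempfile


def check_kaggle_credentials(path):
    """Return a list of problems found in a kaggle.json file (empty if it looks fine)."""
    if not os.path.isfile(path):
        return [f"{path} does not exist"]
    problems = []
    # Flag any permission bit granted to group or others.
    mode = stat.S_IMODE(os.stat(path).st_mode)
    if mode & 0o077:
        problems.append(f"permissions too open: {oct(mode)}")
    try:
        with open(path, encoding="utf-8") as f:
            data = json.load(f)
    except (OSError, json.JSONDecodeError) as exc:
        return problems + [f"cannot parse JSON: {exc}"]
    if not isinstance(data, dict):
        return problems + ["top-level JSON value is not an object"]
    for field in ("username", "key"):
        if not data.get(field):
            problems.append(f"missing field: {field}")
    return problems


# Demo on a temporary file; the credentials here are placeholders.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "kaggle.json")
    with open(demo_path, "w", encoding="utf-8") as f:
        json.dump({"username": "someone", "key": "placeholder"}, f)
    os.chmod(demo_path, 0o600)
    demo_problems = check_kaggle_credentials(demo_path)
    print(demo_problems)
```

In Colab the same check could be pointed at `/root/.kaggle/kaggle.json`; an empty file would be reported as unparseable JSON.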


## **Mounting `GoogleDrive`**
When `GoogleDrive` was mounted on `Colab` first, the contents of `kaggle.json` on `GoogleDrive` ended up blank, so the mount is done only after the `KaggleAPI` has been set up.

In [0]:
from google.colab import drive
drive.mount('/content/drive')

Go to this URL in a browser: https://accounts.google.com/o/oauth2/auth?client_id=947318989803-6bn6qk8qdgf4n4g3pfee6491hc0brc4i.apps.googleusercontent.com&redirect_uri=urn%3aietf%3awg%3aoauth%3a2.0%3aoob&response_type=code&scope=email%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdocs.test%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdrive%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdrive.photos.readonly%20https%3a%2f%2fwww.googleapis.com%2fauth%2fpeopleapi.readonly

Enter your authorization code:
··········
Mounted at /content/drive


## **Installing the `Tree` package**

Installed because it is handy for documenting the directory layout.

In [0]:
!apt-get install tree

## Directory layout

Everything under `My Drive` is the `GoogleDrive` mount point. Data saved there is
synced to `GoogleDrive`.

`Colab Notebooks`: notebooks

`datasets`: datasets

`setting`: configuration files

In [0]:
!tree -d

.
├── Colab Notebooks
├── datasets
│   ├── kaggle
│   │   └── titanic
│   └── ml100knock
│       ├── 10_Questionnaire_analysis
│       ├── 1_web_order
│       ├── 2_Retail_data
│       ├── 3_Customer_information
│       ├── 4_Customer_behavior
│       ├── 5_Customer_withdrawal
│       ├── 6_Logistics_route
│       ├── 7_Logistics_network
│       ├── 8_Numerical_simulation
│       └── 9_Potential_customer
│           ├── img
│           └── mov
├── imgs
└── setting
    └── __pycache__

20 directories
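Where the `tree` command is not available, a similar directory listing can be produced from Python. This is a small sketch under assumptions: `list_dirs` is a hypothetical helper, it returns a flat sorted list rather than a drawn tree, and the demo runs on a temporary directory that copies only part of the layout above.

```python
import os
import tempfile


def list_dirs(root):
    """Return every directory below root as a sorted list of relative paths."""
    found = []
    for current, dirnames, _files in os.walk(root):
        for name in dirnames:
            rel = os.path.relpath(os.path.join(current, name), root)
            # Normalize separators so the result looks the same on every OS.
            found.append(rel.replace(os.sep, "/"))
    return sorted(found)


# Demo on a throwaway directory that mimics part of the layout above.
with tempfile.TemporaryDirectory() as root:
    for sub in ("Colab Notebooks", "datasets/kaggle/titanic", "setting"):
        os.makedirs(os.path.join(root, sub))
    dirs = list_dirs(root)

for d in dirs:
    print(d)
print(len(dirs), "directories")
```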


## **Downloading a `Kaggle` dataset**

Datasets are stored under `./drive/My\ Drive/datasets/kaggle/{competition title}/`.

`> !kaggle competitions download -h`

```
usage: kaggle competitions download [-h] [-f FILE_NAME] [-p PATH] [-w] [-o]
                                    [-q]
                                    [competition]

optional arguments:
  -h, --help            show this help message and exit
  competition           Competition URL suffix (use "kaggle competitions list" to show options)
                        If empty, the default competition will be used (use "kaggle config set competition")"
  -f FILE_NAME, --file FILE_NAME
                        File name, all files downloaded if not provided
                        (use "kaggle competitions files -c <competition>" to show options)
  -p PATH, --path PATH  Folder where file(s) will be downloaded, defaults to current working directory
  -w, --wp              Download files to current working path
  -o, --force           Skip check whether local version of file is up to date, force file download
  -q, --quiet           Suppress printing information about the upload/download progress
```






In [0]:
!kaggle competitions download -c titanic -p ./drive/My\ Drive/githyb/datasets/kaggle/titanic/
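Because the destination path contains a space (`My Drive`), quoting is easy to get wrong when the command is assembled by hand. The sketch below builds the command string for a given competition using only the `-c` and `-p` options from the help text above. The helper name `kaggle_download_command` and the `DATASETS_ROOT` constant are hypothetical, and the root directory is assumed to be the one named in the text.

```python
import shlex
from pathlib import PurePosixPath

# Assumed dataset root, taken from the path convention described above.
DATASETS_ROOT = PurePosixPath("./drive/My Drive/datasets/kaggle")


def kaggle_download_command(competition, root=DATASETS_ROOT):
    """Build a shell-safe `kaggle competitions download` command string."""
    dest = root / competition
    parts = [
        "kaggle", "competitions", "download",
        "-c", competition,
        "-p", str(dest),
    ]
    # shlex.quote protects the space in "My Drive" from the shell.
    return " ".join(shlex.quote(p) for p in parts)


command = kaggle_download_command("titanic")
print(command)
```

The resulting string can be pasted after `!` in a Colab cell, or split back into arguments with `shlex.split` for use with `subprocess`.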

## Connecting to GitHub
GitHub and Colab are connected using a [personal access token](https://help.github.com/ja/github/authenticating-to-github/creating-a-personal-access-token-for-the-command-line) and an [HTTPS URL](https://help.github.com/ja/github/using-git/which-remote-url-should-i-use#cloning-with-https-urls-recommended).
[Reference](https://towardsdatascience.com/google-drive-google-colab-github-dont-just-read-do-it-5554d5824228)

Add the module search path

In [0]:
import sys
from os.path import join

REPO_NAME = 'remote-colab'
PROJECT_PATH = '/content/drive/My Drive/'+ REPO_NAME + '/'
sys.path.append(PROJECT_PATH)

Import the settings file

In [0]:
from setting import personal_setting as PS
# PS.email_address = {'your setting e-mail address'}
# PS.personal_token = {'your token'}
# PS.user_name = {'your name'}

Create the clone URL and the project directory

In [0]:
GIT_PATH = "https://" + PS.personal_token + "@github.com/" + PS.user_name + "/" + REPO_NAME + ".git"
print("GIT_PATH: ", GIT_PATH)

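# Create the project directory and move into it.
# `!cd` runs in a throwaway subshell, so `%cd` is used to make the change persist.
!mkdir -p "{PROJECT_PATH}"
%cd "{PROJECT_PATH}"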
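Printing the clone URL as-is writes the personal access token into the notebook output, which then ends up in any committed copy of the notebook. One way to keep the debugging value of the print is to mask the credential part first. The following is a sketch only: `build_clone_url` and `mask_credentials` are hypothetical helper names, and the token and user name in the demo are placeholders.

```python
from urllib.parse import urlsplit


def build_clone_url(token, user_name, repo_name):
    """Build an HTTPS clone URL that embeds a personal access token."""
    return f"https://{token}@github.com/{user_name}/{repo_name}.git"


def mask_credentials(url):
    """Replace the user-info part of a URL with '***' so it is safe to print."""
    parts = urlsplit(url)
    if parts.username is None:
        return url  # nothing to hide
    host = parts.hostname or ""
    if parts.port:
        host += f":{parts.port}"
    return parts._replace(netloc=f"***@{host}").geturl()


# Placeholder values; real ones would come from the settings module.
git_path = build_clone_url("placeholder-token", "someone", "remote-colab")
print("GIT_PATH:", mask_credentials(git_path))
```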

Clone

In [0]:
!git clone "{GIT_PATH}"

Stage changes, commit, and push

In a `!` shell command, a `python` variable `hoge` can be used by writing
`"{hoge}"`. Handy.

In [0]:
!git add -A
!git config --global user.email "{PS.email_address}"
!git config --global user.name "{PS.user_name}"
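The `{...}` expansion is a feature of the notebook's `!` syntax and is not available in a plain Python script. Outside a notebook, a comparable effect can be had by passing the variable as a separate argument to `subprocess.run`. This is an illustrative sketch: the child process is simply the Python interpreter echoing its argument, standing in for a command such as `git config`.

```python
import subprocess
import sys

hoge = "value with spaces"

# Each list element is passed as one argument, so no shell quoting is needed.
result = subprocess.run(
    [sys.executable, "-c", "import sys; print(sys.argv[1])", hoge],
    capture_output=True,
    text=True,
    check=True,
)
echoed = result.stdout.strip()
print(echoed)
```

Passing arguments as a list also avoids the quoting problems that arise when a value contains spaces or quotes.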

If a file included in a `commit` contains the access token, `github` detects it and
revokes the token.

To avoid this, configure `.gitignore` so that such files are not tracked.



```
# .gitignore
# personal_setting is defined here, so exclude it from tracking
setting/*.py
*.json
*.csv
# .git/config holds the same credentials; Git never tracks .git/ itself
.git/*
```
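As an extra safeguard alongside `.gitignore`, text can be scanned for token-like strings before it is committed. The sketch below is a heuristic, not the detection GitHub itself performs. The pattern assumes tokens that are either 40 lowercase hexadecimal characters or carry the `ghp_` prefix followed by 36 alphanumeric characters. Commit hashes also match the first form, so hits need a manual look. `find_token_like_strings` is a hypothetical helper name.

```python
import re

# Assumed token shapes: 40 hex characters, or "ghp_" followed by 36 alphanumerics.
TOKEN_LIKE = re.compile(r"\b(?:[0-9a-f]{40}|ghp_[A-Za-z0-9]{36})\b")


def find_token_like_strings(text):
    """Return every substring of text that looks like an access token."""
    return TOKEN_LIKE.findall(text)


# Placeholder content; the 40-character value below is not a real token.
sample = 'personal_token = "0123456789abcdef0123456789abcdef01234567"'
hits = find_token_like_strings(sample)
print(len(hits), "token-like string(s) found")
```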



In [142]:
!git commit -m 'some fixes'

[master 63cdd38] some fixes
 2 files changed, 296 insertions(+), 1 deletion(-)
 create mode 100644 setting.md


In [143]:
!git push origin master

To https://github.com/otompton/remote-colab.git
 ! [rejected]        master -> master (fetch first)
error: failed to push some refs to 'https://<personal_token>@github.com/otompton/remote-colab.git'
hint: Updates were rejected because the remote contains work that you do
hint: not have locally. This is usually caused by another repository pushing
hint: to the same ref. You may want to first integrate the remote changes
hint: (e.g., 'git pull ...') before pushing again.
hint: See the 'Note about fast-forwards' in 'git push --help' for details.


In [144]:
!git remote update -p

Fetching origin
remote: Enumerating objects: 13, done.
remote: Counting objects: 100% (13/13), done.
remote: Compressing objects: 100% (12/12), done.
remote: Total 12 (delta 6), reused 0 (delta 0), pack-reused 0
Unpacking objects: 100% (12/12), done.
From https://github.com/otompton/remote-colab
   29e4e59..f2b10d0  master     -> origin/master


In [146]:
!git branch

* master


In [147]:
!git fetch origin master

From https://github.com/otompton/remote-colab
 * branch            master     -> FETCH_HEAD


In [149]:
!git status

On branch master
Your branch and 'origin/master' have diverged,
and have 3 and 5 different commits each, respectively.
  (use "git pull" to merge the remote branch into yours)

Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

	modified:   setting.ipynb

no changes added to commit (use "git add" and/or "git commit -a")


In [150]:
!git show

commit 63cdd38716c9290f7bb760280ac59f41185981f6 (HEAD -> master)
Author: {PS.user_name} <{PS.email_address}>
Date:   Mon Dec 30 12:33:38 2019 +0000

    some fixes

diff --git a/Colab Notebooks/setting.ipynb b/Colab Notebooks/setting.ipynb
index 3ed8a6d..50e5775 100644
--- a/Colab Notebooks/setting.ipynb
+++ b/Colab Notebooks/setting.ipynb
@@ -1 +1 @@
-{"nbformat":4,"nbformat_minor":0,"metadata":{"colab":{"name":"setting.ipynb","provenance":[],"collapsed_sections":[]},"kernelspec":{"name":"python3","display_name":"Python 3"}},"cells":[ ... [output truncated]

In [151]:
!git merge FETCH_HEAD

hint: Waiting for your editor to close the file... error: unable to start editor 'editor'
Not committing merge; use 'git commit' to complete the merge.


In [152]:
!git commit -m 'some fixes'

[master 10c4c7b] some fixes


In [153]:
!git status

On branch master
Your branch is ahead of 'origin/master' by 4 commits.
  (use "git push" to publish your local commits)

Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

	modified:   setting.ipynb

no changes added to commit (use "git add" and/or "git commit -a")


In [0]:
!git add -A

In [155]:
!git commit -m 'some fixes'

[master 5f0a7f8] some fixes
 1 file changed, 1 insertion(+), 1 deletion(-)


In [156]:
!git push origin master

Counting objects: 16, done.
Delta compression using up to 2 threads.
Compressing objects: 100% (16/16), done.
Writing objects:  56% (9/16) ... [output truncated]

In [157]:
!git diff

diff --git a/Colab Notebooks/setting.ipynb b/Colab Notebooks/setting.ipynb
index f7652c4..68c88a7 100644
--- a/Colab Notebooks/setting.ipynb
+++ b/Colab Notebooks/setting.ipynb
@@ -1 +1 @@
-{"nbformat":4,"nbformat_minor":0,"metadata":{"colab":{"name":"setting.ipynb","provenance":[],"collapsed_sections":[]},"kernelspec":{"name":"python3","display_name":"Python 3"}},"cells":[ ... [output truncated]