<h3>Setup</h3>
<ul>
<li>Mount Google Drive</li>
<li>Prepare to use the Kaggle API</li>
</ul>

**Mount Google Drive**

In [None]:
from google.colab import drive
drive.mount('/content/drive')

**Prepare to use the Kaggle API**

In [None]:
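# Install the Kaggle CLI and copy the API token from Drive to where the CLI looks for it.
!pip install -q kaggle
!mkdir -p ~/.kaggle
!cp /content/drive/My\ Drive/kaggle/kaggle.json ~/.kaggle/
# Restrict the token to the owner; the CLI warns about looser permissions.
!chmod 600 ~/.kaggle/kaggle.json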
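
If a later Kaggle command fails to authenticate, it can help to inspect the token file directly. The following is a minimal sketch, not part of the Kaggle tooling: `check_kaggle_credentials` is a hypothetical helper that checks that the file exists, that it is readable by the owner only, and that it contains the `username` and `key` fields of a downloaded token.

```python
import json
import stat
from pathlib import Path


def check_kaggle_credentials(path):
    """Return a list of problems found with a kaggle.json file (empty if none)."""
    path = Path(path)
    if not path.is_file():
        return [f"{path} does not exist"]

    problems = []
    # The file should not be accessible to group or other users (mode 600).
    mode = stat.S_IMODE(path.stat().st_mode)
    if mode & 0o077:
        problems.append(f"permissions are {oct(mode)}, expected 0o600")

    # The token downloaded from Kaggle is a JSON object with these two keys.
    try:
        token = json.loads(path.read_text())
    except json.JSONDecodeError:
        return problems + ["file is not valid JSON"]
    for key in ("username", "key"):
        if key not in token:
            problems.append(f"missing field: {key}")
    return problems
```

In the notebook, `check_kaggle_credentials(Path.home() / ".kaggle" / "kaggle.json")` should return an empty list after the setup cell has run.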

<h3>Downloading the data</h3>
<ul>
<li>Download the data with the Kaggle API</li>
<li>Unzip the downloaded data</li>
</ul>

**Download the data with the Kaggle API**

In [None]:
!kaggle competitions download -c titanic

**Unzip the downloaded data**

In [None]:
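# Assumes the download step saved a single archive named titanic.zip,
# as current versions of the Kaggle CLI do; adjust if your file names differ.
!unzip -o titanic.zip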
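
The same step can be done from Python, which avoids depending on the `unzip` binary and makes it easy to see which files were extracted. This is a minimal sketch using the standard `zipfile` module; `extract_archive` is a hypothetical helper, and the archive name you pass in should be whatever file the download step actually produced.

```python
import zipfile
from pathlib import Path


def extract_archive(archive_path, dest_dir="."):
    """Extract every member of a zip archive and return the member names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(dest)
        return sorted(archive.namelist())
```

The returned list shows at a glance whether `train.csv` and `test.csv` are present before the loading step.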

<h3>Loading the data</h3>
<ul>
<li>Load the data</li>
<li>Inspect the data</li>
</ul>

**Load the data**

In [None]:
import pandas as pd
train = pd.read_csv('train.csv')
test = pd.read_csv('test.csv')

**Inspect the data**

In [None]:
# A cell only renders its last expression, so display both frames explicitly.
display(train.head())
display(test.head())
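
Beyond looking at the first rows, it is useful to see which columns contain missing values, since those drive the preprocessing choices that follow. The sketch below is one way to do that; `missing_summary` is a hypothetical helper, not something defined elsewhere in this notebook.

```python
import pandas as pd


def missing_summary(df):
    """Return the count and share of missing values for columns that have any."""
    counts = df.isnull().sum()
    summary = pd.DataFrame({
        "missing": counts,
        "ratio": counts / len(df),
    })
    # Keep only columns with at least one missing value, worst first.
    return summary[summary["missing"] > 0].sort_values("missing", ascending=False)
```

Calling `missing_summary(train)` and `missing_summary(test)` shows which columns need imputation in each file.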

<h3>Preprocessing the data</h3>
<ul>
<li>Fill in missing values</li>
<li>Encode categorical variables</li>
<li>Select features</li>
</ul>

**Fill in missing values**

In [None]:
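# Fill missing ages and fares with the column median,
# and missing embarkation ports with 'S', the most common value.
train['Age'] = train['Age'].fillna(train['Age'].median())
train['Embarked'] = train['Embarked'].fillna('S')
test['Age'] = test['Age'].fillna(test['Age'].median())
test['Fare'] = test['Fare'].fillna(test['Fare'].median())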
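
The cell above computes the test medians from the test set itself. A common alternative is to learn every fill value from the training data and reuse it on the test data, so both sets are treated identically. The following sketch illustrates that variant; `fit_fill_values` and `apply_fill_values` are hypothetical helpers, and this is a swapped-in technique rather than what the cell above does.

```python
import pandas as pd


def fit_fill_values(train_df):
    """Learn the fill values from the training data only."""
    return {
        "Age": train_df["Age"].median(),
        "Fare": train_df["Fare"].median(),
        "Embarked": train_df["Embarked"].mode().iloc[0],
    }


def apply_fill_values(df, fill_values):
    """Return a copy of df with missing values replaced by the learned values."""
    present = {col: val for col, val in fill_values.items() if col in df.columns}
    return df.fillna(value=present)
```

Usage would look like `fills = fit_fill_values(train)`, followed by `train = apply_fill_values(train, fills)` and `test = apply_fill_values(test, fills)`.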

**Encode categorical variables**

In [None]:
train['Sex'] = train['Sex'].map({'female': 0, 'male': 1}).astype(int)
train['Embarked'] = train['Embarked'].map({'S': 0, 'C': 1, 'Q': 2}).astype(int)
test['Sex'] = test['Sex'].map({'female': 0, 'male': 1}).astype(int)
test['Embarked'] = test['Embarked'].map({'S': 0, 'C': 1, 'Q': 2}).astype(int)
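
`Series.map` returns `NaN` for any label that is missing from the mapping, and the following `astype(int)` then fails with an error that does not name the offending value. The sketch below wraps the same map-then-cast step with an explicit check; `encode_category` is a hypothetical helper introduced only for illustration.

```python
import pandas as pd


def encode_category(series, mapping):
    """Map category labels to integers, failing loudly on unmapped labels."""
    encoded = series.map(mapping)
    # Labels absent from the mapping (including missing values) become NaN.
    unmapped = series[encoded.isna()].unique().tolist()
    if unmapped:
        raise ValueError(f"{series.name}: no code for values {unmapped}")
    return encoded.astype(int)
```

For example, `encode_category(train['Sex'], {'female': 0, 'male': 1})` gives the same result as the cell above when every label is covered, and raises a `ValueError` listing the unexpected labels otherwise.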

**Select features**

In [None]:
features = ['Pclass', 'Sex', 'Age', 'SibSp', 'Parch', 'Fare', 'Embarked']
X_train = train[features]
X_test = test[features]
y_train = train['Survived']

<h3>Training the model</h3>
<ul>
<li>Define the model</li>
<li>Train the model</li>
</ul>

**Define the model**

In [None]:
from sklearn.ensemble import RandomForestClassifier
model = RandomForestClassifier(n_estimators=100, max_depth=5, random_state=1)

**Train the model**

In [None]:
model.fit(X_train, y_train)
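
The notebook goes straight from fitting to submission, so there is no local estimate of accuracy. One way to get a rough estimate is k-fold cross-validation on the training data. The sketch below uses a synthetic dataset (`X_demo`, `y_demo`, an assumption made so the example runs on its own) with the same model settings; in the notebook you would pass `X_train` and `y_train` instead.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for X_train / y_train so the sketch is self-contained.
X_demo, y_demo = make_classification(n_samples=200, n_features=7, random_state=1)

model_demo = RandomForestClassifier(n_estimators=100, max_depth=5, random_state=1)
# 5-fold cross-validation; each score is the accuracy on one held-out fold.
scores = cross_val_score(model_demo, X_demo, y_demo, cv=5, scoring="accuracy")
mean_accuracy = scores.mean()
print(f"fold accuracies: {scores}, mean: {mean_accuracy:.3f}")
```

The mean over the folds is only an estimate, and the score on the Kaggle leaderboard can differ from it.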

<h3>Prediction</h3>
<ul>
<li>Predict</li>
</ul>

**Predict**

In [None]:
predictions = model.predict(X_test)

<h3>Submission</h3>
<ul>
<li>Create the submission file</li>
<li>Submit</li>
</ul>

**Create the submission file**

In [None]:
output = pd.DataFrame({'PassengerId': test.PassengerId, 'Survived': predictions})
output.to_csv('my_submission.csv', index=False)
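
Before uploading, it can be worth checking that the file has the shape the competition expects: a `PassengerId` column and a `Survived` column of 0/1 values, with one row per passenger in the test set. The following is a minimal sketch; `validate_submission` is a hypothetical helper, not part of the Kaggle API.

```python
import pandas as pd


def validate_submission(submission, expected_ids):
    """Return a list of format problems in a submission DataFrame (empty if none)."""
    columns = list(submission.columns)
    if columns != ["PassengerId", "Survived"]:
        return [f"unexpected columns: {columns}"]

    problems = []
    if len(submission) != len(expected_ids):
        problems.append(f"expected {len(expected_ids)} rows, got {len(submission)}")
    if set(submission["PassengerId"]) != set(expected_ids):
        problems.append("PassengerId values do not match the test set")
    if not submission["Survived"].isin([0, 1]).all():
        problems.append("Survived must contain only 0 or 1")
    return problems
```

In the notebook, `validate_submission(output, test['PassengerId'])` should return an empty list.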

**Submit**

In [None]:
!kaggle competitions submit -c titanic -f my_submission.csv -m "First submission"

<h3>References</h3>
<ul>
<li><a href="https://www.kaggle.com/c/titanic">Titanic: Machine Learning from Disaster</a></li>
<li><a href="https://www.kaggle.com/alexisbcook/titanic-tutorial">Titanic Tutorial</a></li>
</ul>
