<a href="https://colab.research.google.com/github/hank199599/deep_learning_keras_log/blob/main/Chapter3.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 3-1 Core Components
![3-1](https://github.com/hank199599/deep_learning_keras_log/blob/main/pictures/3-1.png?raw=true)  
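* Layers: the building blocks that are combined into a neural network model
* Input data and targets: used to train and evaluate the network
* Loss function: provides the feedback signal for learning
* Optimizer: determines how learning proceeds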

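## Layers
|Per-sample data|Stored tensor|Layer used|
|---|---|---|
|1D (vectors)|2D|Densely connected layer (`Dense`)|
|2D (sequences)|3D|Recurrent layer|
|3D (images)|4D|2D convolution layer (`Conv2D`)|

The stored tensor includes a batch dimension, so it has one more dimension than a single training sample.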
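
The extra batch dimension can be checked directly with NumPy. The following is a minimal sketch; the batch size and the per-sample sizes are hypothetical values chosen only for illustration.

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration.
batch_size = 32

# Vector data: each sample is a 1D feature vector, so a batch is a 2D tensor.
vector_batch = np.zeros((batch_size, 100))        # (samples, features)

# Sequence data: each sample is 2D (timesteps, features), so a batch is 3D.
sequence_batch = np.zeros((batch_size, 20, 8))    # (samples, timesteps, features)

# Image data: each sample is 3D (height, width, channels), so a batch is 4D.
image_batch = np.zeros((batch_size, 28, 28, 1))   # (samples, height, width, channels)

for name, batch in [("vector", vector_batch),
                    ("sequence", sequence_batch),
                    ("image", image_batch)]:
    # ndim of the batch is always one more than ndim of a single sample
    print(name, "batch ndim:", batch.ndim, "sample ndim:", batch[0].ndim)
```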

## Model
The network topology defines a **hypothesis space**: the set of all possible configurations of the weight parameters under that topology.

## Loss function
* Binary classification: binary crossentropy
* Multiclass classification: categorical crossentropy
* Regression: mean squared error
* Sequence learning: connectionist temporal classification (CTC)
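
To make two of these losses concrete, here is a minimal pure-Python sketch of binary crossentropy and mean squared error. The function names, the `eps` clipping value, and the toy inputs are assumptions made for illustration; this is not the Keras implementation.

```python
import math

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy between 0/1 labels and predicted probabilities."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip so that log(0) never occurs
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(y_true)

def mean_squared_error(y_true, y_pred):
    """Mean of the squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

# Toy labels and predictions (hypothetical values).
confident_right = binary_crossentropy([1, 0], [0.9, 0.1])
confident_wrong = binary_crossentropy([1, 0], [0.1, 0.9])
print(confident_right < confident_wrong)  # the loss penalizes confident mistakes
```

The loss is small when the predicted probability agrees with the label and grows quickly when a confident prediction is wrong, which is the feedback signal the optimizer uses.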


# 3-4 Binary Classification: Classifying Movie Reviews as Positive or Negative
[IMDb, Internet Movie Database](https://www.imdb.com)

Load the IMDb dataset

In [5]:
from keras.datasets import imdb

(train_data,train_labels),(test_data,test_labels) = imdb.load_data(num_words=10000)

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/imdb.npz




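### num_words
When loading the data, only words whose index in the word-to-index dictionary is between 0 and 9999 (the 10,000 most frequent words) are kept.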

In [6]:
max([max(sequence) for sequence in train_data])

9999

Decode the integers back into words

In [7]:
word_index = imdb.get_word_index()  # dictionary mapping words to integer indices
reverse_word_index = dict(
    [(value, key) for (key, value) in word_index.items()]  # invert it to map indices back to words
)

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/imdb_word_index.json


In [8]:
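# Indices are offset by 3 because 0, 1, and 2 are reserved for "padding", "start of sequence", and "unknown"
decoded_review = ' '.join([reverse_word_index.get(i - 3, '?') for i in train_data[0]])
decoded_review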

"? this film was just brilliant casting location scenery story direction everyone's really suited the part they played and you could just imagine being there robert ? is an amazing actor and now the same being director ? father came from the same scottish island as myself so i loved the fact there was a real connection with this film the witty remarks throughout the film were great it was just brilliant so much that i bought the film as soon as it was released for ? and would recommend it to everyone to watch and the fly fishing was amazing really cried at the end it was so sad and you know what they say if you cry at a film it must have been good and this definitely was also ? to the two little boy's that played the ? of norman and paul they were just brilliant children are often left out of the ? list i think because the stars that play them all grown up are such a big profile for the whole film but these children are amazing and should be praised for what they have done don't you th

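## Preparing the data
Method 1:
1. Pad each sub-list in the data so that all of them have the same length
2. Convert the whole dataset into a tensor of shape (number of samples, padded sample length)
3. Feed it into an Embedding layer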
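
The padding steps of Method 1 can be sketched with NumPy as follows. The helper `pad_sequences_sketch` and the toy input are hypothetical. This sketch pads and truncates at the end of each sequence, whereas Keras's own `pad_sequences` utility pads at the front by default, so treat it as an illustration of the idea and not a drop-in replacement.

```python
import numpy as np

def pad_sequences_sketch(sequences, maxlen=None, pad_value=0):
    """Hypothetical helper: pad integer lists to a common length."""
    if maxlen is None:
        maxlen = max(len(seq) for seq in sequences)
    # Start from a (samples, maxlen) matrix filled with the padding value.
    padded = np.full((len(sequences), maxlen), pad_value, dtype="int64")
    for i, seq in enumerate(sequences):
        trimmed = seq[:maxlen]              # drop anything beyond maxlen
        padded[i, :len(trimmed)] = trimmed  # left-align; the rest stays pad_value
    return padded

# Toy reviews of different lengths (hypothetical word indices).
toy = [[1, 14, 22], [1, 5], [1, 7, 9, 3]]
padded = pad_sequences_sketch(toy)
print(padded.shape)  # (3, 4): (number of samples, padded sample length)
```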
  
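Method 2: 
1. Apply [one-hot encoding](https://colab.research.google.com/github/hank199599/data_science_from_scratch_reading_log/blob/main/Chapter19.ipynb#scrollTo=QzUQbu-DtfQa) to each sub-list in the data, turning it into a vector of 0s and 1s

```python
from typing import List

def one_hot_encode(i: int, num_labels: int = 10) -> List[float]:
  # Return a vector with 1.0 at position i and 0.0 everywhere else
  return [1.0 if j == i else 0.0 for j in range(num_labels)]
```
2. Feed it into a Dense layer, which can handle floating-point data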



In [14]:
import numpy as np

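def vectorize_sequences(sequences, dimension=10000):
  # All-zero matrix of shape (number of samples, dimension)
  results = np.zeros((len(sequences), dimension))
  for i, sequence in enumerate(sequences):
    results[i, sequence] = 1.  # set the positions of the words that appear in the review to 1
  return results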

### Convert the training data into 10,000-dimensional vectors

In [15]:
x_train = vectorize_sequences(train_data) # vectorize the training data

In [16]:
x_test = vectorize_sequences(test_data)

In [17]:
x_train[0]

array([0., 1., 1., ..., 0., 0., 0.])

### Vectorize the labels

In [19]:
y_train = np.array(train_labels).astype('float32')

In [20]:
y_test = np.array(test_labels).astype('float32')

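## Building the neural network

### Key decisions when building a stack of Dense layers:
* How many layers should be used?
* How many neurons should each layer have?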

In [None]:
from keras import models
from keras import layers

model = models.Sequential()
model.add(layers.Dense(16,activation='relu',input_shape=(10000,)))
model.add(layers.Dense(16,activation='relu'))
model.add(layers.Dense(1,activation='sigmoid'))  # a single sigmoid unit outputs the probability of a positive review
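
To see what a stack of Dense layers computes for binary classification, here is a minimal NumPy sketch of the forward pass: two 16-unit ReLU layers followed by a single sigmoid unit. The random weights, the batch of 4 fake inputs, and the helper names are assumptions for illustration only; a real Keras model learns its weights during training.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    return np.maximum(z, 0.0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def dense(x, w, b, activation):
    # A Dense layer computes activation(x @ W + b)
    return activation(x @ w + b)

# Fake batch of 4 multi-hot review vectors (hypothetical data).
x = rng.integers(0, 2, size=(4, 10000)).astype("float64")

# Randomly initialised weights, standing in for learned parameters.
w1, b1 = rng.normal(scale=0.01, size=(10000, 16)), np.zeros(16)
w2, b2 = rng.normal(scale=0.01, size=(16, 16)), np.zeros(16)
w3, b3 = rng.normal(scale=0.01, size=(16, 1)), np.zeros(1)

h1 = dense(x, w1, b1, relu)
h2 = dense(h1, w2, b2, relu)
probs = dense(h2, w3, b3, sigmoid)  # one probability per review

print(probs.shape)  # (4, 1)
```

Because the last layer has one unit with a sigmoid activation, each review is mapped to a single value strictly between 0 and 1, which can be read as the probability of a positive review.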

# 3-5 Multiclass Classification: Newswire Topics

# 3-6 Regression: Predicting House Prices