# TensorFlow
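- TensorFlow is a programming interface for implementing and running machine learning algorithms.
- TensorFlow is built around a computation graph made up of a set of nodes. Each node represents an operation that has zero or more inputs or outputs. Tensors are created as symbolic handles that refer to the inputs and outputs of these operations.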
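
The graph idea can be illustrated with a toy sketch in plain Python. This is not how TensorFlow is implemented; the `Node` class and the `constant` helper are hypothetical names introduced only for illustration. Each node holds an operation plus handles to its input nodes, and nothing is computed until the graph is evaluated.

```python
from dataclasses import dataclass
from typing import Callable, Tuple
import operator


@dataclass(frozen=True)
class Node:
    """One operation in a toy computation graph (hypothetical, not a TensorFlow class)."""
    name: str
    op: Callable[..., float]
    inputs: Tuple["Node", ...] = ()

    def evaluate(self) -> float:
        # Evaluate the input nodes first, then apply this node's operation.
        args = [node.evaluate() for node in self.inputs]
        return self.op(*args)


def constant(name: str, value: float) -> Node:
    # A node with zero inputs that always outputs the same value.
    return Node(name, lambda: value)


# Build the graph for z = (a + b) * b. Nothing is computed at this point.
a = constant("a", 2.0)
b = constant("b", 3.0)
total = Node("add", operator.add, (a, b))
z = Node("mul", operator.mul, (total, b))

# Evaluation walks the graph from the output node back to its inputs.
result = z.evaluate()
print(result)  # 15.0
```

Here `total` and `z` play the role of symbolic handles: they describe where a value comes from rather than holding the value itself.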

## Creating tensors

In [2]:
import tensorflow as tf
import numpy as np
np.set_printoptions(precision=3)


In [3]:
# Create a NumPy array
a = np.array([1, 2, 3], dtype=np.int32)
# Create an ordinary Python list
b = [4, 5, 6]

In [5]:
# Both can be converted to a tensor
t_a = tf.convert_to_tensor(a)
t_b = tf.convert_to_tensor(b)
print(t_a)
print(t_b)
print(t_a.shape)

tf.Tensor([1 2 3], shape=(3,), dtype=int32)
tf.Tensor([4 5 6], shape=(3,), dtype=int32)
(3,)


In [6]:
t_ones = tf.ones((2, 3))
t_ones.numpy()

array([[1., 1., 1.],
       [1., 1., 1.]], dtype=float32)

In [7]:
# A tensor can also be created directly with tf.constant (open question: how does this differ from convert_to_tensor?)
c_tensor = tf.constant([1.2, 5, np.pi], dtype=tf.float32)
print(c_tensor)

tf.Tensor([1.2   5.    3.142], shape=(3,), dtype=float32)


## TensorFlow Dataset API
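The API provides a class for building efficient, convenient preprocessing pipelines. It addresses the following needs:
- Loading a dataset that is too large to fit in memory batch by batch (in mini-batches)
- Applying specific transformations or preprocessing steps to the data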

Training batch by batch is possible because a DNN is trained incrementally, using an iterative optimization algorithm such as stochastic gradient descent.
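
As a rough illustration of this incremental training, the sketch below runs mini-batch stochastic gradient descent on a one-variable linear model in plain Python. The `batches` and `train_sgd` helpers, the toy data, and the hyperparameters are all assumptions made for this example; they are not part of TensorFlow. The point is only that each update uses a single mini-batch, so the full dataset never has to be processed at once.

```python
import random


def batches(data, batch_size):
    """Yield consecutive mini-batches one at a time."""
    for start in range(0, len(data), batch_size):
        yield data[start:start + batch_size]


def train_sgd(data, batch_size=4, lr=0.05, epochs=200, seed=0):
    """Fit y = w * x + b by mini-batch SGD on the mean squared error."""
    rng = random.Random(seed)
    data = list(data)
    w, b = 0.0, 0.0
    for _ in range(epochs):
        rng.shuffle(data)  # different batch composition in every epoch
        for batch in batches(data, batch_size):
            # Gradients of the mean squared error over this mini-batch only.
            grad_w = sum(2 * (w * x + b - y) * x for x, y in batch) / len(batch)
            grad_b = sum(2 * (w * x + b - y) for x, y in batch) / len(batch)
            # One incremental parameter update per mini-batch.
            w -= lr * grad_w
            b -= lr * grad_b
    return w, b


# Noise-free toy data generated from y = 2x + 1 (an assumption for illustration).
toy_data = [(x / 10, 2 * (x / 10) + 1) for x in range(-10, 11)]
w, b = train_sgd(toy_data)
print(f"w={w:.3f}, b={b:.3f}")
```

With this noise-free data the estimates should approach the generating values `w = 2` and `b = 1`.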

There are three main ways to create a dataset:
- Create it from existing tensors
- Create it from local files
- Fetch it from the tensorflow_datasets library

## Creating a dataset from existing tensors

In [4]:
a = [1.2, 3.4, 7.5, 4.1, 5.0, 1.0]
ds = tf.data.Dataset.from_tensor_slices(a)
print(ds)

<_TensorSliceDataset element_spec=TensorSpec(shape=(), dtype=tf.float32, name=None)>


In [7]:
ds_batch = ds.batch(3)  # create mini-batches of 3 elements each
for i, elem in enumerate(ds_batch, 1):
    print(f'batch {i}, {elem.numpy()}')

batch 1, [1.2 3.4 7.5]
batch 2, [4.1 5.  1. ]


## Creating a dataset from local files

In [8]:
# Omitted
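
The original cell is omitted, so the following is only a minimal sketch of one possible approach, not the author's code. It assumes plain text files with one number per line; the file names, the temporary directory, and the sample values are hypothetical. It uses `tf.data.TextLineDataset` to read lines lazily and `tf.strings.to_number` to parse them.

```python
import os
import tempfile

import tensorflow as tf

# Write two small text files to a temporary directory (hypothetical sample data).
tmp_dir = tempfile.mkdtemp()
file_paths = []
for idx, values in enumerate([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]):
    path = os.path.join(tmp_dir, f"part_{idx}.txt")
    with open(path, "w") as f:
        f.write("\n".join(str(v) for v in values) + "\n")
    file_paths.append(path)

# Each dataset element is one line of text, read from disk as it is consumed.
ds_lines = tf.data.TextLineDataset(file_paths)

# Parse each line into a float32 scalar, then group into mini-batches of 3.
ds_files = ds_lines.map(tf.strings.to_number).batch(3)

file_batches = [batch.numpy().tolist() for batch in ds_files]
print(file_batches)
```

For image files, a similar pattern would typically start from a list of file paths and apply a decoding function with `map`, but that depends on the data layout and is not shown here.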

## Fetching a dataset from the tensorflow_datasets library

In [13]:
import tensorflow_datasets as tfds

In [14]:
print(len(tfds.list_builders()))



1291


In [15]:
celeba_bldr = tfds.builder('celeb_a')
print(celeba_bldr.info.features)

FeaturesDict({
    'attributes': FeaturesDict({
        '5_o_Clock_Shadow': bool,
        'Arched_Eyebrows': bool,
        'Attractive': bool,
        'Bags_Under_Eyes': bool,
        'Bald': bool,
        'Bangs': bool,
        'Big_Lips': bool,
        'Big_Nose': bool,
        'Black_Hair': bool,
        'Blond_Hair': bool,
        'Blurry': bool,
        'Brown_Hair': bool,
        'Bushy_Eyebrows': bool,
        'Chubby': bool,
        'Double_Chin': bool,
        'Eyeglasses': bool,
        'Goatee': bool,
        'Gray_Hair': bool,
        'Heavy_Makeup': bool,
        'High_Cheekbones': bool,
        'Male': bool,
        'Mouth_Slightly_Open': bool,
        'Mustache': bool,
        'Narrow_Eyes': bool,
        'No_Beard': bool,
        'Oval_Face': bool,
        'Pale_Skin': bool,
        'Pointy_Nose': bool,
        'Receding_Hairline': bool,
        'Rosy_Cheeks': bool,
        'Sideburns': bool,
        'Smiling': bool,
        'Straight_Hair': bool,
        'Wavy_Hair': b

In [17]:
print(celeba_bldr.info.features['image'])
print(celeba_bldr.info.features['attributes'].keys())
print(celeba_bldr.info.citation)

Image(shape=(218, 178, 3), dtype=uint8)
dict_keys(['5_o_Clock_Shadow', 'Arched_Eyebrows', 'Attractive', 'Bags_Under_Eyes', 'Bald', 'Bangs', 'Big_Lips', 'Big_Nose', 'Black_Hair', 'Blond_Hair', 'Blurry', 'Brown_Hair', 'Bushy_Eyebrows', 'Chubby', 'Double_Chin', 'Eyeglasses', 'Goatee', 'Gray_Hair', 'Heavy_Makeup', 'High_Cheekbones', 'Male', 'Mouth_Slightly_Open', 'Mustache', 'Narrow_Eyes', 'No_Beard', 'Oval_Face', 'Pale_Skin', 'Pointy_Nose', 'Receding_Hairline', 'Rosy_Cheeks', 'Sideburns', 'Smiling', 'Straight_Hair', 'Wavy_Hair', 'Wearing_Earrings', 'Wearing_Hat', 'Wearing_Lipstick', 'Wearing_Necklace', 'Wearing_Necktie', 'Young'])
@inproceedings{conf/iccv/LiuLWT15,
  added-at = {2018-10-09T00:00:00.000+0200},
  author = {Liu, Ziwei and Luo, Ping and Wang, Xiaogang and Tang, Xiaoou},
  biburl = {https://www.bibsonomy.org/bibtex/250e4959be61db325d2f02c1d8cd7bfbb/dblp},
  booktitle = {ICCV},
  crossref = {conf/iccv/2015},
  ee = {http://doi.ieeecomputersociety.org/10.1109/ICCV.2015.425},
  i

In [19]:
celeba_bldr.download_and_prepare()
datasets = celeba_bldr.as_dataset(shuffle_files=False)
datasets.keys()

Downloading and preparing dataset 1.39 GiB (download: 1.39 GiB, generated: 1.63 GiB, total: 3.01 GiB) to /home/hoyo/tensorflow_datasets/celeb_a/2.1.0...


Dl Size...: 100%|██████████| 45137925/45137925 [00:00<00:00, 69700965.01 MiB/s]
Dl Completed...: 100%|██████████| 5/5 [00:00<00:00,  7.68 url/s]


NonMatchingChecksumError: Artifact https://drive.google.com/uc?export=download&id=0B7EVK8r0v71pZjFTYXZWM3FlRnM, downloaded to /home/hoyo/tensorflow_datasets/downloads/ucexport_download_id_0B7EVK8r0v71pZjFTYXZWM3FlDDaXUAQO8EGH_a7VqGNLRtW52mva1LzDrb-V723OQN8.tmp.370418b583e4436f9185e8a11d8a3b87/download, has wrong checksum:
* Expected: UrlInfo(size=1.34 GiB, checksum='46fb89443c578308acf364d7d379fe1b9efb793042c0af734b6112e4fd3a8c74', filename='img_align_celeba.zip')
* Got: UrlInfo(size=2.37 KiB, checksum='1b86097ad19a8cb93cdd7558afae77a7c3db6b7cc9e88e9acd0965d9c0594590', filename='download')
To debug, see: https://www.tensorflow.org/datasets/overview#fixing_nonmatchingchecksumerror

This error can be resolved by following https://qiita.com/takkeybook/items/358e57f0706367e83be6, but the procedure is tedious, so it is omitted here.