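dir() function
dir() lists the names of an object's attributes and methods and returns them as a list of strings.

Main uses:
1. See which attributes and methods a package, class, or object contains
2. Explore an API when you are not sure of the exact method name

Example: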
```python
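import torch
# List the names defined in the torch package (submodules, functions, classes)
print(dir(torch))
# List the attributes and methods of a tensor object
tensor = torch.tensor([1, 2, 3])
print(dir(tensor))
# List the contents of a specific submodule
print(dir(torch.nn))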
```
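
The output of dir() on a large object is long, so it often helps to filter it. The following is a minimal sketch that uses only built-in types; the helper name `public_names` is made up for this illustration and is not part of Python or PyTorch.

```python
def public_names(obj, keyword=""):
    """Return names from dir(obj) that do not start with an underscore
    and contain `keyword` (hypothetical helper, for illustration only)."""
    return [name for name in dir(obj)
            if not name.startswith("_") and keyword in name]

# Public string methods whose name contains "strip"
print(public_names(str, "strip"))
```

The same helper can be pointed at any object, for example `public_names(torch, "cat")` once torch is imported.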

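help() function
help() shows an object's detailed documentation, including the function signature, parameter descriptions, and usage notes.

Main uses:
1. Get the detailed documentation of a function or class
2. Understand what each parameter means and how to use it
3. Read usage examples and caveats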

```python
import torch

# Show the help text for the torch.tensor function
help(torch.tensor)

# Show the help text for a specific function
help(torch.cat)

# Show the help text for a class
# help(torch.nn.Linear)
```
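
help() prints to the screen. When the signature or docstring is needed as a value, the standard `inspect` module is one option. The sketch below uses `json.dumps` purely as a stand-in target; note that `inspect.signature` can raise `ValueError` for some functions implemented in C, so it may not work on every callable.

```python
import inspect
import json

# inspect.signature returns the call signature as an object
sig = inspect.signature(json.dumps)
print(sig)

# inspect.getdoc returns the cleaned-up docstring as a plain string
doc = inspect.getdoc(json.dumps)
print(doc.splitlines()[0])

# Parameter names can be listed programmatically
param_names = list(sig.parameters)
print(param_names[:3])
```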

TODO: so far these two functions have not seemed very useful.

Dataset



DataLoader




Data organization

train
    - label1
      - file1
      - file2
      - ...
    - label2
      - file3
      - file4
      - ...
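
A layout with one folder per label can be turned into a list of (file, label) pairs by walking the directory. This is a rough sketch using only the standard library; `list_samples` is a hypothetical helper, and the temporary directory only mimics the tree above so the code can be run anywhere.

```python
import tempfile
from pathlib import Path

def list_samples(root):
    """Return (file_path, label_name) pairs for a root laid out as root/<label>/<file>."""
    samples = []
    for label_dir in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        for file_path in sorted(p for p in label_dir.iterdir() if p.is_file()):
            samples.append((file_path, label_dir.name))
    return samples

# Build a throwaway copy of the layout to try the helper on
with tempfile.TemporaryDirectory() as tmp:
    train = Path(tmp) / "train"
    layout = {"label1": ["file1", "file2"], "label2": ["file3", "file4"]}
    for label, files in layout.items():
        (train / label).mkdir(parents=True)
        for name in files:
            (train / label / name).touch()
    label_pairs = [(path.name, label) for path, label in list_samples(train)]

print(label_pairs)
```

For image files, torchvision's `datasets.ImageFolder` expects this same one-folder-per-class layout.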

folder
    - img
      - file1
      - file2
    - label
      - file1_label
      - file2_label
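
When inputs and labels live in two parallel folders, each input has to be matched to its label file by name. The sketch below assumes the naming shown in the tree (label file name = input file name + `_label`); `pair_files` and its `suffix` parameter are hypothetical, and real data with file extensions would need a different matching rule.

```python
import tempfile
from pathlib import Path

def pair_files(root, suffix="_label"):
    """Pair root/img/<name> with root/label/<name><suffix>; skip inputs without a label file."""
    img_dir = Path(root) / "img"
    label_dir = Path(root) / "label"
    pairs = []
    for img_path in sorted(p for p in img_dir.iterdir() if p.is_file()):
        label_path = label_dir / (img_path.name + suffix)
        if label_path.is_file():
            pairs.append((img_path, label_path))
    return pairs

# Build a throwaway copy of the layout, plus one input that has no label
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp) / "folder"
    (folder / "img").mkdir(parents=True)
    (folder / "label").mkdir()
    for name in ["file1", "file2", "file3"]:
        (folder / "img" / name).touch()
    for name in ["file1_label", "file2_label"]:
        (folder / "label" / name).touch()
    matched = [(img.name, lab.name) for img, lab in pair_files(folder)]

print(matched)
```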


In [4]:
from torch.utils.data import Dataset
from PIL import Image

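Dataset is an abstract class.

All datasets that map keys to data samples should subclass it.
Every subclass must override `__getitem__` to support fetching the data sample for a given key.
Subclasses may optionally override `__len__` to return the size of the dataset.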
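
To make the contract concrete, here is a toy map-style dataset that needs no files. `SquaresDataset` is an invented example, not something from PyTorch; it only illustrates overriding `__getitem__` and `__len__`.

```python
import torch
from torch.utils.data import Dataset

class SquaresDataset(Dataset):
    """Toy map-style dataset: sample i is the pair (i, i squared)."""

    def __init__(self, n):
        self.n = n

    def __getitem__(self, idx):
        # Raise IndexError for out-of-range keys, like a normal sequence
        if not 0 <= idx < self.n:
            raise IndexError(idx)
        x = torch.tensor(float(idx))
        return x, x ** 2

    def __len__(self):
        return self.n

squares = SquaresDataset(5)
x, y = squares[3]
print(len(squares), x.item(), y.item())
```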

In [None]:
# help(Dataset.__getitem__)

Help on function __getitem__ in module torch.utils.data.dataset:

__getitem__(self, index) -> +T_co



In [None]:
class MyData(Dataset):
    def __init__(self):
        super().__init__()

    def __getitem__(self, idx):
        img_path = ""

Help on class Dataset in module torch.utils.data.dataset:

class Dataset(typing.Generic)
 |  Dataset(*args, **kwds)
 |  
 |  An abstract class representing a :class:`Dataset`.
 |  
 |  All datasets that represent a map from keys to data samples should subclass
 |  it. All subclasses should overwrite :meth:`__getitem__`, supporting fetching a
 |  data sample for a given key. Subclasses could also optionally overwrite
 |  :meth:`__len__`, which is expected to return the size of the dataset by many
 |  :class:`~torch.utils.data.Sampler` implementations and the default options
 |  of :class:`~torch.utils.data.DataLoader`.
 |  
 |  .. note::
 |    :class:`~torch.utils.data.DataLoader` by default constructs a index
 |    sampler that yields integral indices.  To make it work with a map-style
 |    dataset with non-integral indices/keys, a custom sampler must be provided.
 |  
 |  Method resolution order:
 |      Dataset
 |      typing.Generic
 |      builtins.object
 |  
 |  Methods defined 

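PyTorch reads data through the combination of Dataset and DataLoader.

Dataset defines the data format and the transforms applied to the data; DataLoader iterates over the dataset and loads it batch by batch.

For your own dataset, subclass the official Dataset class and define three methods:
1. `__init__`: receives external arguments and sets up the sample set (for example, file paths and label tables)
2. `__getitem__`: reads one element of the sample set at a time and returns the data needed for training/validation
3. `__len__`: returns the number of samples in the dataset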
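
The three methods and the batch-wise iteration can be tried out on in-memory tensors before touching real files. This is a small sketch under that assumption; `InMemoryDataset` and the generated data are made up for the illustration.

```python
import torch
from torch.utils.data import Dataset, DataLoader

class InMemoryDataset(Dataset):
    def __init__(self, features, labels):
        # __init__: receive external arguments and store the sample set
        self.features = features
        self.labels = labels

    def __getitem__(self, index):
        # __getitem__: return one (sample, label) pair
        return self.features[index], self.labels[index]

    def __len__(self):
        # __len__: number of samples
        return len(self.features)

features = torch.arange(20, dtype=torch.float32).reshape(10, 2)
labels = torch.arange(10) % 2
loader = DataLoader(InMemoryDataset(features, labels), batch_size=4, shuffle=False)

# Each iteration yields one batch; the last batch holds the remaining 2 samples
batch_shapes = [(tuple(xb.shape), tuple(yb.shape)) for xb, yb in loader]
print(batch_shapes)
```

With 10 samples and `batch_size=4`, the loader yields three batches. Passing `drop_last=True` to DataLoader would discard the final, smaller batch.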

In [11]:
import os
import numpy as np
import torch
import torch.nn as nn
from torch.utils.data import Dataset, DataLoader
import torch.optim as optimizer
from torchvision import datasets
import pandas as pd

In [None]:
class MyDataset(Dataset):
    def __init__(self, data_dir, info_csv, image_list, transform=None):
        """
        Args:
            data_dir: path to image directory.
            info_csv: path to the csv file containing image indexes
                with corresponding labels.
            image_list: path to the txt file containing the image names of the training/validation set.
            transform: optional transform to be applied on a sample.
        """
        label_info = pd.read_csv(info_csv)
        # Read the image names and close the file afterwards
        with open(image_list) as f:
            image_file = f.readlines()
        self.data_dir = data_dir
        self.image_file = image_file
        self.label_info = label_info
        self.transform = transform

    def __getitem__(self, index):
        """
        Args:
            index: the index of item
        Returns:
            image and its labels
        """
        image_name = self.image_file[index].strip('\n')
        raw_label = self.label_info.loc[self.label_info['Image_index'] == image_name]
        # Take the first column of the matching row(s); this is a pandas Series, not a scalar
        label = raw_label.iloc[:, 0]
        image_name = os.path.join(self.data_dir, image_name)
        image = Image.open(image_name).convert('RGB')
        if self.transform is not None:
            image = self.transform(image)
        return image, label

    def __len__(self):
        return len(self.image_file)


# Tensors

## Creation

In [None]:
import torch

| Function | Description |
|--|--|
|  |  |

In [None]:
x = torch.rand(4, 3)
print(x)

tensor([[0.6182, 0.8133, 0.9775],
        [0.6920, 0.0324, 0.0588],
        [0.9815, 0.6694, 0.3969],
        [0.2472, 0.4335, 0.4964]])
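
Besides `torch.rand`, a few other common constructors are sketched below. The selection is illustrative and far from complete.

```python
import torch

zeros = torch.zeros(2, 3)                   # all zeros, shape (2, 3)
ones = torch.ones(2, 3)                     # all ones, shape (2, 3)
steps = torch.arange(0, 10, 2)              # 0, 2, 4, 6, 8
from_list = torch.tensor([[1, 2], [3, 4]])  # built from a nested Python list
uniform = torch.rand(4, 3)                  # uniform samples in [0, 1)

print(zeros.shape, ones.dtype, steps.tolist(), from_list.dtype)
```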


In [None]:
torch.