Please click [here](https://ai.baidu.com/docs#/AIStudio_Project_Notebook/a38e5576) for the basic usage of this environment and more detailed instructions.

![download.png](attachment:b338b289-fa6c-42f6-9a33-cfa0076210e5.png)

# Problem Description

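## Task Description
Image classification on the Caltech dataset. Caltech101 contains 102 classes, each with roughly 40 to 800 images, and the training set totals 7,999 images. The task takes an image as input and identifies which class it belongs to, using a classification method covered in the course (support vector machine, deep neural network, convolutional neural network, and so on).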

![image.png](attachment:68e40a79-fd1f-4a4d-93c7-0973013b7ec0.png)

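## Data Description
The images directory stores all training and test images. train.txt stores the training image paths and their labels, one sample per line in the form image path + \t + label. test.txt stores the test images.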
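
The line format described above can be read with a few lines of Python. This is a minimal sketch, assuming each non-empty line holds exactly one tab; the function name `parse_annotation_lines` and the sample file names are hypothetical and not taken from the real train.txt.

```python
from typing import List, Tuple


def parse_annotation_lines(lines: List[str]) -> List[Tuple[str, int]]:
    """Parse 'image_path<TAB>label' lines into (path, label) pairs."""
    samples = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        img_path, label = line.split("\t")
        samples.append((img_path, int(label)))
    return samples


# Hypothetical sample lines, for illustration only.
example = ["0001.jpg\t5\n", "0002.jpg\t17\n", "\n"]
samples = parse_annotation_lines(example)
print(samples)
```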


![image.png](attachment:84ae546f-408b-4788-bfad-bd71a5b7d4ae.png)


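## Submission
For the exam, submit the project version that contains the model code together with the result file. The result file is a TXT file named result.txt, and its fields must be written in the specified format.
Result file requirements:
1. Each line has the form image name\tlabel, for example 101_0073.jpg\t13
2. Check that the output contains exactly 1145 lines; otherwise the score is invalid.
3. Name the output file result.txt, with one record per line. A sample is shown below: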


![image.png](attachment:1e712e43-84d3-4b44-93f9-22228cabe1b4.png)
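
The requirements above can be checked before submitting. The following is a sketch under stated assumptions: the helper name `check_result_file` is hypothetical, the second file name in the demo is made up, and the label is assumed to be an integer, as in the 101_0073.jpg\t13 example.

```python
import os
import tempfile


def check_result_file(path: str, expected_rows: int = 1145) -> None:
    """Raise ValueError if the file does not match the stated format."""
    with open(path, "r", encoding="utf-8") as f:
        rows = f.read().splitlines()
    if len(rows) != expected_rows:
        raise ValueError(f"expected {expected_rows} rows, found {len(rows)}")
    for number, row in enumerate(rows, start=1):
        fields = row.split("\t")
        if len(fields) != 2 or not fields[0]:
            raise ValueError(f"row {number}: expected 'name<TAB>label'")
        int(fields[1])  # raises ValueError if the label is not an integer


# Demonstration on a tiny hypothetical file with 2 rows instead of 1145.
demo_path = os.path.join(tempfile.mkdtemp(), "result.txt")
with open(demo_path, "w", encoding="utf-8") as f:
    f.write("101_0073.jpg\t13\n101_0074.jpg\t7\n")
check_result_file(demo_path, expected_rows=2)
```

For the real submission, calling `check_result_file("result.txt")` with the default row count would raise if the file does not contain 1145 well-formed lines.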


In [1]:
import os
import zipfile
import paddle
import numpy as np 
from PIL import Image
import random


In [22]:
train_parameters={
    "input_size":[3,80,80],
    "class_dim":102,
    "src_path":"data/data146107/dataset.zip",
    "target_path":"data/data146107/dataset",
    "dataset_path":"data/data146107/dataset/dataset",
    "train_list_path":"data/data146107/dataset/dataset/x.txt",
    "eval_list_path":"data/data146107/dataset/dataset/y.txt",
    "img_path":"data/data146107/dataset/dataset/images",
    "test_list_path":"data/data146107/dataset/dataset/test.txt",
    "num_epochs":200,
    "train_batch_size":32,
    "learning_strategy":{
        "lr":0.001
    }
}

In [3]:
def unzip_data(src_path,target_path):
    # Skip extraction when the target directory already exists.
    if os.path.isdir(target_path):
        print("Archive already extracted")
    else:
        with zipfile.ZipFile(src_path) as z:
            z.extractall(path=target_path)

In [4]:
unzip_data(train_parameters['src_path'],train_parameters['target_path'])
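
The extract-once behavior can be tried without the real dataset. This is a minimal sketch that builds a throwaway archive in a temporary directory; `extract_once` and the archive contents are hypothetical stand-ins, not part of the project.

```python
import os
import tempfile
import zipfile


def extract_once(src_path: str, target_path: str) -> bool:
    """Extract src_path into target_path unless target_path already exists.

    Returns True if extraction happened, False if it was skipped.
    """
    if os.path.isdir(target_path):
        return False
    with zipfile.ZipFile(src_path) as archive:  # closed even on error
        archive.extractall(path=target_path)
    return True


# Build a tiny throwaway archive so the sketch runs anywhere.
work_dir = tempfile.mkdtemp()
zip_path = os.path.join(work_dir, "demo.zip")
with zipfile.ZipFile(zip_path, "w") as archive:
    archive.writestr("dataset/train.txt", "0001.jpg\t5\n")

target = os.path.join(work_dir, "demo")
first = extract_once(zip_path, target)
second = extract_once(zip_path, target)
print(first, second)
```

One consequence of checking only for the directory: if an earlier extraction was interrupted, the directory exists and the archive is not extracted again, so the directory has to be deleted by hand in that case.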

In [5]:
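def sort_train_list(dataset_path,train_path,eval_path):
    # Read train.txt, sort samples by label, then write every 10th sample to
    # eval_path and the remaining samples (shuffled) to train_path.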
    path=os.path.join(dataset_path,'train.txt')
    data_list=[]
    with open(path,'r',encoding='utf-8') as f:
        for line in f.readlines():
            img_path,label=line.strip().split('\t')
            data_list.append((img_path,int(label)))
    
    data_list=sorted(data_list,key=lambda x:x[1])

    path=os.path.join(dataset_path,'sorted.txt')
    
    train_list=[]
    eval_list=[]
    cnt=0
    with open(path,'w') as f:
        for line in data_list:
            f.write(line[0]+'\t'+str(line[1])+'\n')
            cnt+=1
            if cnt%10==0:
                eval_list.append(line[0]+'\t'+str(line[1])+'\n')
            else:
                train_list.append(line[0]+'\t'+str(line[1])+'\n')
    
    random.shuffle(train_list)
    with open(train_path,'w') as f:
        for line in train_list:
            f.write(line)

    with open(eval_path,'w') as f:
        for line in eval_list:
            f.write(line)
         

In [6]:
sort_train_list(train_parameters['dataset_path'],train_parameters['train_list_path'],train_parameters['eval_list_path'])
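
The split rule used above (sort by label, every 10th sample goes to evaluation) can be reproduced on synthetic data to see the resulting sizes. This is a sketch, not the project code: `split_every_tenth`, the fixed seed, and the synthetic labels are assumptions added for illustration.

```python
import random


def split_every_tenth(samples, seed=0):
    """Sort by label, send every 10th sample to eval, shuffle the rest."""
    ordered = sorted(samples, key=lambda item: item[1])
    train, evaluation = [], []
    for position, item in enumerate(ordered, start=1):
        if position % 10 == 0:
            evaluation.append(item)
        else:
            train.append(item)
    random.Random(seed).shuffle(train)  # fixed seed added for repeatability
    return train, evaluation


# 7,999 synthetic samples spread over 102 hypothetical labels.
synthetic = [(f"{i}.jpg", i % 102) for i in range(7999)]
train, evaluation = split_every_tenth(synthetic)
print(len(train), len(evaluation))  # 7200 799
```

With 7,999 samples the rule yields 799 evaluation samples, which matches the "Eval samples: 799" lines in the training log. Because the counter runs over the whole sorted list rather than per class, a class with fewer than 10 images may end up with no evaluation sample.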

In [7]:
def calmin_size(path):
    path=os.path.join(path,"images")
    imgs=os.listdir(path)
    min_size=999
    for img in imgs:
        img_path=os.path.join(path,img)
        img=Image.open(img_path)
        # print(img.size)
        for num in img.size:
            if num<min_size:
                min_size=num
                print(img_path)
    return min_size


In [8]:
calmin_size(train_parameters['dataset_path'])

data/data146107/dataset/dataset/images/3059.jpg
data/data146107/dataset/dataset/images/3059.jpg
data/data146107/dataset/dataset/images/4890.jpg
data/data146107/dataset/dataset/images/5566.jpg
data/data146107/dataset/dataset/images/9109.jpg
data/data146107/dataset/dataset/images/6916.jpg
data/data146107/dataset/dataset/images/3402.jpg
data/data146107/dataset/dataset/images/477.jpg
data/data146107/dataset/dataset/images/6670.jpg


80

In [24]:
class FoodData(paddle.io.Dataset):

    def __init__(self,path,imgdata_path):
        super().__init__()
        self.data_list=[]
        with open(path,'r',encoding='utf-8') as f:
            for line in f.readlines():
                line=line.strip().split('\t')
                if len(line)>=2:
                    img_path,label=line
                else:
                    img_path=line[0]
                    label=103  # placeholder: lines without a label (test.txt) get a dummy value
                self.data_list.append((os.path.join(imgdata_path,img_path),int(label)))

    def __getitem__(self,index):
        img_path,label=self.data_list[index]
        img=Image.open(img_path)
        if img.mode!= 'RGB':
            img=img.convert("RGB")
        img=img.resize((80,80),Image.BILINEAR)
        img=np.array(img).astype('float32')
        img=img.transpose((2,0,1))/255.0
        return img,np.array([label],dtype='int64')

    def __len__(self):
        return len(self.data_list)

In [10]:
train_data=FoodData(train_parameters['train_list_path'],train_parameters['img_path'])
eval_data=FoodData(train_parameters['eval_list_path'],train_parameters['img_path'])

In [11]:
train_data[10]

(array([[[0.3529412 , 0.1882353 , 0.23137255, ..., 0.6       ,
          0.6784314 , 0.5529412 ],
         [0.36078432, 0.15686275, 0.27450982, ..., 0.6862745 ,
          0.65882355, 0.5372549 ],
         [0.41568628, 0.24313726, 0.30980393, ..., 0.7058824 ,
          0.7019608 , 0.47058824],
         ...,
         [0.41568628, 0.44313726, 0.5176471 , ..., 0.69803923,
          0.58431375, 0.6627451 ],
         [0.43529412, 0.56078434, 0.5294118 , ..., 0.6627451 ,
          0.7019608 , 0.56078434],
         [0.5176471 , 0.60784316, 0.6784314 , ..., 0.5372549 ,
          0.62352943, 0.5803922 ]],
 
        [[0.42745098, 0.23137255, 0.30588236, ..., 0.6431373 ,
          0.7411765 , 0.62352943],
         [0.42745098, 0.2       , 0.34509805, ..., 0.7411765 ,
          0.7294118 , 0.6156863 ],
         [0.47058824, 0.27450982, 0.37254903, ..., 0.7764706 ,
          0.78431374, 0.5686275 ],
         ...,
         [0.40392157, 0.49019608, 0.56078434, ..., 0.79607844,
          0.7058824 , 0.

In [12]:
MyDNN=paddle.nn.Sequential(
    paddle.nn.Flatten(start_axis=1),
    paddle.nn.Linear(3*80*80,4096),
    paddle.nn.ReLU(),
    paddle.nn.Linear(4096,1024),
    paddle.nn.ReLU(),
    paddle.nn.Linear(1024,512),
    paddle.nn.ReLU(),
    paddle.nn.Linear(512,256),
    paddle.nn.ReLU(),
    paddle.nn.Linear(256,102)
)

W0712 11:16:22.206952   398 device_context.cc:447] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 11.2, Runtime API Version: 10.1
W0712 11:16:22.211663   398 device_context.cc:465] device: 0, cuDNN Version: 7.6.


In [13]:
model=paddle.Model(MyDNN)
model.summary((1,3,80,80))

---------------------------------------------------------------------------
 Layer (type)       Input Shape          Output Shape         Param #    
   Flatten-1      [[1, 3, 80, 80]]        [1, 19200]             0       
   Linear-1         [[1, 19200]]          [1, 4096]         78,647,296   
    ReLU-1          [[1, 4096]]           [1, 4096]              0       
   Linear-2         [[1, 4096]]           [1, 1024]          4,195,328   
    ReLU-2          [[1, 1024]]           [1, 1024]              0       
   Linear-3         [[1, 1024]]            [1, 512]           524,800    
    ReLU-3           [[1, 512]]            [1, 512]              0       
   Linear-4          [[1, 512]]            [1, 256]           131,328    
    ReLU-4           [[1, 256]]            [1, 256]              0       
   Linear-5          [[1, 256]]            [1, 102]           26,214     
Total params: 83,524,966
Trainable params: 83,524,966
Non-trainable params: 0
--------------------------------

{'total_params': 83524966, 'trainable_params': 83524966}
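
The parameter counts in the summary can be reproduced by hand: a fully connected layer with `in` inputs and `out` outputs has `in * out` weights plus `out` biases. A short check, where `linear_params` is a hypothetical helper name:

```python
def linear_params(in_features: int, out_features: int) -> int:
    """Weights plus biases of one fully connected layer."""
    return in_features * out_features + out_features


# Layer widths of MyDNN, starting from the flattened 3x80x80 input.
widths = [3 * 80 * 80, 4096, 1024, 512, 256, 102]
per_layer = [linear_params(a, b) for a, b in zip(widths, widths[1:])]
total = sum(per_layer)
print(per_layer)  # [78647296, 4195328, 524800, 131328, 26214]
print(total)      # 83524966
```

The first layer alone accounts for about 94% of the parameters, because it connects all 19,200 input values to 4,096 units.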

In [14]:
model.prepare(
    paddle.optimizer.SGD(
        learning_rate=train_parameters['learning_strategy']['lr'],
        parameters=model.parameters()
    ),
    paddle.nn.CrossEntropyLoss(),
    paddle.metric.Accuracy()
)
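
To make the two training ingredients concrete, here is a plain-Python sketch of softmax cross-entropy for one sample and of a single SGD update. It assumes the loss is used with its default settings, where softmax is applied to the raw network outputs inside the loss; the function names are hypothetical and this is not how Paddle implements either step internally.

```python
import math


def softmax_cross_entropy(logits, label):
    """Cross-entropy of one sample from raw scores (softmax applied inside)."""
    peak = max(logits)  # subtract the max for numerical stability
    log_sum = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_sum - logits[label]


def sgd_step(weights, grads, lr=0.001):
    """Plain SGD: move each weight against its gradient."""
    return [w - lr * g for w, g in zip(weights, grads)]


# A model that scores all 102 classes equally has a loss of ln(102).
uniform_loss = softmax_cross_entropy([0.0] * 102, label=13)
print(round(uniform_loss, 4))
```

The uniform case gives a useful reference point: a loss near ln(102), about 4.625, means the network is doing no better than guessing among the 102 classes.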

In [15]:
model.fit(train_data,eval_data,epochs=train_parameters['num_epochs'],batch_size=train_parameters['train_batch_size'],verbose=1)

The loss value printed in the log is the current step, and the metric is the average value of previous steps.
Epoch 1/200


Eval begin...
Eval samples: 799
Epoch 2/200
Eval begin...
Eval samples: 799
Epoch 3/200
Eval begin...
Eval samples: 799
Epoch 4/200
Eval begin...
Eval samples: 799
Epoch 5/200
Eval begin...
Eval samples: 799
Epoch 6/200
Eval begin...
Eval samples: 799
Epoch 7/200
Eval begin...
Eval samples: 799
Epoch 8/200
Eval begin...
Eval samples: 799
Epoch 9/200
Eval begin...
Eval samples: 799
Epoch 10/200
Eval begin...
Eval samples: 799
Epoch 11/200
Eval begin...
Eval samples: 799
Epoch 12/200
Eval begin...
Eval samples: 799
Epoch 13/200
Eval begin...
Eval samples: 799
Epoch 14/200
Eval begin...
Eval samples: 799
Epoch 15/200
Eval begin...
Eval samples: 799
Epoch 16/200
Eval begin...
Eval samples: 799
Epoch 17/200
Eval begin...
Eval samples: 799
Epoch 18/200
Eval begin...
Eval samples: 799
Epoch 19/200
Eval begin...
Eval samples: 799
Epoch 20/200
Eval begin...
Eval samples: 799
Epoch 21/200
Eval begin...
Eval samples: 799
Epoch 22/200
Eval begin...
Eval samples: 799
Epoch 23/200
Eval begin...
Eval

In [16]:
model.save('MyDNN')

In [31]:
def test_predict():
    test_data=FoodData(train_parameters['test_list_path'],train_parameters['img_path'])
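    out=model.predict(test_data)
    # out holds one list per model output; each list has one array per batch
    # (batch_size defaults to 1), so argmax(...)[0] has shape (num_samples, 1).
    lab=np.argmax(out,axis=-1)[0]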
    res=[]
    with open(train_parameters['test_list_path'],'r',encoding='utf-8') as f:
        for i,line in enumerate(f.readlines()):
            res.append(line.strip()+'\t'+str(lab[i][0])+'\n')
    
    with open('result.txt','w') as f:
        for line in res:
            f.write(line)

In [32]:
test_predict()


Predict begin...
Predict samples: 1145
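
The step from per-class scores to result lines can be shown without the trained model. This is a small sketch with made-up names and 3-class scores; `argmax` and `build_result_lines` are hypothetical helpers, and the real run uses 102 classes and the names from test.txt.

```python
def argmax(scores):
    """Index of the largest score (the first one wins on ties)."""
    return max(range(len(scores)), key=scores.__getitem__)


def build_result_lines(names, score_rows):
    """Pair each image name with its predicted class index."""
    if len(names) != len(score_rows):
        raise ValueError("one score row is required per image name")
    return [f"{name}\t{argmax(row)}\n" for name, row in zip(names, score_rows)]


# Hypothetical names and 3-class scores, for illustration only.
names = ["a.jpg", "b.jpg"]
scores = [[0.1, 0.7, 0.2], [0.9, 0.05, 0.05]]
lines = build_result_lines(names, scores)
print(lines)
```

The length check guards against the predictions and the test list drifting out of step, which would otherwise pair images with the wrong labels silently.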
