# How to use Colab with Fastai  
Loading input images from a custom location.  
The test image folder lives on the mounted Google Drive:  
"/content/gdrive/My Drive/Colab Notebooks/data/cats_data/"  
  * the sub-folder "cats" contains the image files  
  * cats_labels.csv contains comma-separated image labels.  

### S1. Set up the fastai library, load the Fastbook content, and run the Fastbook setup.  


In [1]:
# Install fastbook (which pulls in fastai v2), then run the fastbook setup
!pip install -Uqq fastbook
import fastbook
fastbook.setup_book()

Mounted at /content/gdrive


In [2]:
!pwd
!ls

/content
gdrive	sample_data


In [3]:
# import all modules from fastbook
from fastbook import *

In [5]:
# check Path object
Path?
'''
Loaded successfully, pathlib.py, part of fastai setup.  
Init signature: Path(*args, **kwargs)
Docstring:     
PurePath subclass that can make system calls.
'''

'\nLoaded successfully, pathlib.py, part of fastai setup.  \nInit signature: Path(*args, **kwargs)\nDocstring:     \nPurePath subclass that can make system calls.\n'

In [34]:
# define the path to my image folder, cats
path = Path('/content/gdrive/My Drive/Colab Notebooks/data/cats_data')
pathImg = path / 'cats'
print(path, "\n" ,pathImg)


/content/gdrive/My Drive/Colab Notebooks/data/cats_data 
 /content/gdrive/My Drive/Colab Notebooks/data/cats_data/cats


In [None]:
# Use the "%cd" notebook magic to change the working directory

%cd '/content/gdrive/My Drive/Colab Notebooks/'
!pwd
!dir
# Worked!

### S2. How do I read image files into a fastai DataBlock and DataLoaders?  

In [18]:

(pathImg).ls()

(#15) [Path('/content/gdrive/My Drive/Colab Notebooks/data/cats_data/cats/cats_05.jpg'),Path('/content/gdrive/My Drive/Colab Notebooks/data/cats_data/cats/cats_01.jpg'),Path('/content/gdrive/My Drive/Colab Notebooks/data/cats_data/cats/cats_02.jpg'),Path('/content/gdrive/My Drive/Colab Notebooks/data/cats_data/cats/cats_03.jpg'),Path('/content/gdrive/My Drive/Colab Notebooks/data/cats_data/cats/cats_04.jpg'),Path('/content/gdrive/My Drive/Colab Notebooks/data/cats_data/cats/cats_00.jpg'),Path('/content/gdrive/My Drive/Colab Notebooks/data/cats_data/cats/cats_06.jpg'),Path('/content/gdrive/My Drive/Colab Notebooks/data/cats_data/cats/cats_07.jpg'),Path('/content/gdrive/My Drive/Colab Notebooks/data/cats_data/cats/cats_08.jpg'),Path('/content/gdrive/My Drive/Colab Notebooks/data/cats_data/cats/cats_09.jpg')...]

In [37]:
fname = (pathImg).ls()[0]
fname

Path('/content/gdrive/My Drive/Colab Notebooks/data/cats_data/cats/cats_05.jpg')

In [40]:
# re.findall(r'(.+)_\d+\.jpg$', fname.name)
# The dot is escaped so that it matches only a literal "." before "jpg".
re.findall(r'(.+)\.jpg$', fname.name)

['cats_05']
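
The two patterns tried above behave differently, and the difference matters later when they are used as labellers. As a minimal sketch using only the standard `re` module (the file name is taken from the listing above; the variable names `per_file` and `per_class` are illustrative):

```python
import re

# File name taken from the folder listing above.
name = "cats_05.jpg"

# Pattern A keeps the numeric suffix, so every file becomes its own label.
per_file = re.findall(r"(.+)\.jpg$", name)

# Pattern B strips the "_<digits>" suffix, so all files share one label.
per_class = re.findall(r"(.+)_\d+\.jpg$", name)

print(per_file)   # ['cats_05']
print(per_class)  # ['cats']
```

With pattern A a folder of 15 files yields 15 distinct labels; with pattern B it yields the single label `cats`.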

In [22]:
RegexLabeller(r'(.+)_\d+\.jpg$')

<fastai.data.transforms.RegexLabeller at 0x7fc657a326d0>

In [29]:
# point path at the image folder itself
path = pathImg
path

Path('/content/gdrive/My Drive/Colab Notebooks/data/cats_data/cats')

In [30]:

pets = DataBlock(blocks = (ImageBlock, CategoryBlock),
                 get_items=get_image_files, 
                 splitter=RandomSplitter(seed=42),
                 get_y=using_attr(RegexLabeller(r'(.+)_\d+\.jpg$'), 'name'),
                 item_tfms=Resize(460),
                 batch_tfms=aug_transforms(size=224, min_scale=0.75))
dls = pets.dataloaders(path)

In [31]:
dls

<fastai.data.core.DataLoaders at 0x7fc6587c2c10>

In [33]:
pets

<fastai.data.block.DataBlock at 0x7fc656f33190>

In [32]:
dls.show_batch(nrows=1, ncols=3)

ValueError: ignored

In [35]:
# To debug a DataBlock, fastbook encourages using the summary method, e.g. pets1.summary(path/"images")

pets1 = DataBlock(blocks = (ImageBlock, CategoryBlock),
                 get_items=get_image_files, 
                 splitter=RandomSplitter(seed=42),
                 get_y=using_attr(RegexLabeller(r'(.+)_\d+\.jpg$'), 'name'))
pets1.summary(pathImg)  # Summary here

Setting-up type transforms pipelines
Collecting items from /content/gdrive/My Drive/Colab Notebooks/data/cats_data/cats
Found 15 items
2 datasets of sizes 12,3
Setting up Pipeline: PILBase.create
Setting up Pipeline: partial -> Categorize -- {'vocab': None, 'sort': True, 'add_na': False}

Building one sample
  Pipeline: PILBase.create
    starting from
      /content/gdrive/My Drive/Colab Notebooks/data/cats_data/cats/cats_13.jpg
    applying PILBase.create gives
      PILImage mode=RGB size=225x225
  Pipeline: partial -> Categorize -- {'vocab': None, 'sort': True, 'add_na': False}
    starting from
      /content/gdrive/My Drive/Colab Notebooks/data/cats_data/cats/cats_13.jpg
    applying partial gives
      cats
    applying Categorize -- {'vocab': None, 'sort': True, 'add_na': False} gives
      TensorCategory(0)

Final sample: (PILImage mode=RGB size=225x225, TensorCategory(0))


Collecting items from /content/gdrive/My Drive/Colab Notebooks/data/cats_data/cats
Found 15 items
2 dat

RuntimeError: ignored

In [None]:
# Resize all images manually to 225 x 225 (3 channels).
# Read the labels into pandas; get_y supplies the labels.
# Woohoo! The images did get loaded into the DataBlock and DataLoaders.
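
The note about taking labels from the CSV can be sketched as a lookup function. This is only an illustration under stated assumptions: the notebook never shows the contents of cats_labels.csv, so the column names `fname` and `label`, the sample rows, and the helper `label_from_csv` are all hypothetical, and the standard `csv` module stands in for pandas to keep the example self-contained.

```python
import csv
import io
from pathlib import Path

# Hypothetical stand-in for cats_labels.csv; the real column names and
# label values are not shown in this notebook.
csv_text = "fname,label\ncats_00.jpg,tabby\ncats_01.jpg,siamese\n"

# Build a file-name -> label lookup table from the CSV rows.
labels = {row["fname"]: row["label"]
          for row in csv.DictReader(io.StringIO(csv_text))}

def label_from_csv(item):
    """Return the label for an image path, keyed by its file name."""
    return labels[Path(item).name]

print(label_from_csv("/some/folder/cats_01.jpg"))  # siamese
```

A function of this shape could be passed as `get_y` to the `DataBlock` in place of the `RegexLabeller`, since `get_y` receives each item returned by `get_items`.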


### S3. Try the bears example from notebook 2.

In [45]:

bears = DataBlock(
    blocks=(ImageBlock, CategoryBlock), 
    get_items=get_image_files, 
    splitter=RandomSplitter(valid_pct=0.2, seed=42),
    #get_y=parent_label,
    # This pattern keeps the numeric suffix, so each file gets its own label (e.g. 'cats_13').
    get_y=using_attr(RegexLabeller(r'(.+)\.jpg$'), 'name'),
    item_tfms=Resize(128), 
    batch_tfms=aug_transforms(size=128, min_scale=0.75))
    # item size, batch transform size, changed from 225 to 128.

bears.summary(pathImg)  # Summary here    
dls = bears.dataloaders(pathImg)
dls

Setting-up type transforms pipelines
Collecting items from /content/gdrive/My Drive/Colab Notebooks/data/cats_data/cats
Found 15 items
2 datasets of sizes 12,3
Setting up Pipeline: PILBase.create
Setting up Pipeline: partial -> Categorize -- {'vocab': None, 'sort': True, 'add_na': False}

Building one sample
  Pipeline: PILBase.create
    starting from
      /content/gdrive/My Drive/Colab Notebooks/data/cats_data/cats/cats_13.jpg
    applying PILBase.create gives
      PILImage mode=RGB size=225x225
  Pipeline: partial -> Categorize -- {'vocab': None, 'sort': True, 'add_na': False}
    starting from
      /content/gdrive/My Drive/Colab Notebooks/data/cats_data/cats/cats_13.jpg
    applying partial gives
      cats_13
    applying Categorize -- {'vocab': None, 'sort': True, 'add_na': False} gives
      TensorCategory(10)

Final sample: (PILImage mode=RGB size=225x225, TensorCategory(10))


Collecting items from /content/gdrive/My Drive/Colab Notebooks/data/cats_data/cats
Found 15 items


<fastai.data.core.DataLoaders at 0x7fc656ea5a50>

In [48]:

#dls?
# show_batch() draws the batch and returns None, so its result is not indexed.
dls.show_batch()

ValueError: ignored
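
Colab hides the message behind `ValueError: ignored`, so the cause is not visible here. One plausible explanation, offered as an assumption rather than a diagnosis, is that the training loader yields no batches: the summary reports a 12/3 split of the 15 images, and if the training loader uses a batch size of 64 and drops the last incomplete batch (assumed here to be the fastai defaults), 12 items produce zero batches. The arithmetic can be sketched in plain Python; `split_sizes` and `n_batches` are illustrative helpers, not fastai functions.

```python
def split_sizes(n_items, valid_pct=0.2):
    """Sizes of the (train, valid) split for a given validation fraction."""
    n_valid = int(valid_pct * n_items)
    return n_items - n_valid, n_valid

def n_batches(n_items, bs, drop_last):
    """Number of batches a loader yields over n_items."""
    full, rest = divmod(n_items, bs)
    return full if drop_last or rest == 0 else full + 1

n_train, n_valid = split_sizes(15)
print(n_train, n_valid)                            # 12 3
print(n_batches(n_train, bs=64, drop_last=True))   # 0
print(n_batches(n_train, bs=4, drop_last=True))    # 3
```

If that is the cause, passing a smaller batch size, for example `bears.dataloaders(pathImg, bs=4)`, may be worth trying.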