# Notes on Fastai v2 Walk-thru 1 

#### Development using notebooks


In a development notebook, any cell with an `#export` tag at the beginning is exported to the '.py' file that corresponds to the notebook.  The '.py' file is auto-generated from its notebook.  

The code that is exported into the '.py' file should be the same as in the notebook, with the exception of `import` statements.  Something like `from local.torch_basics import *` becomes `from ..torch_basics import *` in the auto-generated '.py' file, in order to set the paths to the modules correctly.
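
The import rewriting described above can be sketched in a few lines.  This is only an illustration, not the exporter's actual code: the helper names `rewrite_local_import` and `exported_source` are made up, and the default `depth=2` (two leading dots) is assumed to match the `from ..torch_basics import *` example.

```python
import re

def rewrite_local_import(line: str, depth: int = 2) -> str:
    "Turn `from local.x import y` into a relative import (hypothetical helper)."
    dots = "." * depth
    return re.sub(r"^(\s*from\s+)local\.", lambda m: m.group(1) + dots, line)

def exported_source(cell_source: str, depth: int = 2):
    "Return the code of a cell tagged `#export`, or None if the cell is not exported."
    lines = cell_source.splitlines()
    if not lines or not lines[0].strip().startswith("#export"):
        return None
    # Drop the tag line and fix the imports of the remaining lines.
    return "\n".join(rewrite_local_import(l, depth) for l in lines[1:])

cell = "#export\nfrom local.torch_basics import *\ndef f(): return 1"
print(exported_source(cell))
```

Cells without the tag return `None`, which mirrors the idea that only tagged cells end up in the module.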

At the end of a development notebook, there is the cell:
```python
#hide
from local.notebook.export import *
notebook2script(all_fs=True)
```
When this cell is run, all the notebooks are converted into their '.py' files. 

The '.py' files are regular Python modules.  Whilst most of them are auto-generated, there are exceptions, with one being `core.imports`.

#### Tests as documentation

The tests in the development notebooks serve two purposes:

1. They verify that the code is correct.
2. They inform how the code can be used.

Here is a way to run the tests in the notebooks from the terminal:
```bash
for i in {0,1,2}*.ipynb; do sleep 1; python run_notebook.py --fn $i & done
```

### Pets Tutorial

This was notebook 08 in the video.  At the time of writing, it's notebook 10.  

In [None]:
from local.test import *
from local.data.all import *
from local.vision.core import *

In [None]:
source = untar_data(URLs.PETS)/"images"

In [None]:
items = get_image_files(source)

In [None]:
items

(#7390) [/Users/jack/.fastai/data/oxford-iiit-pet/images/Egyptian_Mau_167.jpg,/Users/jack/.fastai/data/oxford-iiit-pet/images/pug_52.jpg,/Users/jack/.fastai/data/oxford-iiit-pet/images/basset_hound_112.jpg,/Users/jack/.fastai/data/oxford-iiit-pet/images/Siamese_193.jpg,/Users/jack/.fastai/data/oxford-iiit-pet/images/shiba_inu_122.jpg,/Users/jack/.fastai/data/oxford-iiit-pet/images/Siamese_53.jpg,/Users/jack/.fastai/data/oxford-iiit-pet/images/Birman_167.jpg,/Users/jack/.fastai/data/oxford-iiit-pet/images/leonberger_6.jpg,/Users/jack/.fastai/data/oxford-iiit-pet/images/Siamese_47.jpg,/Users/jack/.fastai/data/oxford-iiit-pet/images/shiba_inu_136.jpg...]

In [None]:
split_idx = RandomSplitter()(items)

In [None]:
split_idx

((#5912) [6320,1028,5912,4275,5888,1029,508,1484,4928,6537...],
 (#1478) [7112,1156,4299,5298,2499,3851,7001,532,5822,510...])

Anything that returns something that can be called will have a name with the first letter capitalized.  Conventionally, in Python, only `class`es have the first letter capitalized.  But here, for example, `RandomSplitter` is a function, which returns a function.  
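
A rough sketch of this "capitalized function that returns a function" pattern, in plain Python.  `RandomSplitterSketch` is not the library's implementation; the `valid_pct` and `seed` parameters are assumptions chosen for the illustration (a 20% validation share is at least consistent with the 5912/1478 split shown above).

```python
import random

def RandomSplitterSketch(valid_pct=0.2, seed=None):
    "Return a function that splits item indices into (train, valid) at random."
    def _inner(items):
        rng = random.Random(seed)
        idxs = list(range(len(items)))
        rng.shuffle(idxs)
        cut = int(valid_pct * len(items))
        return idxs[cut:], idxs[:cut]
    return _inner

splitter = RandomSplitterSketch(valid_pct=0.2, seed=42)  # configure once
train_idx, valid_idx = splitter(list("abcdefghij"))        # then call on the items
print(len(train_idx), len(valid_idx))
```

The capitalized name signals that the first call only configures the splitter; the actual work happens when the returned function is called on the items.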

#### The `show()` method

A *type* with a `show` method is something that can be used by Fastai's transformation pipeline to show, or display, something.  For example,

In [None]:
class TitledImage(Tuple):
    def show(self, ctx=None, **kwargs): show_titled_image(self, ctx=ctx, **kwargs)

is a type that knows how to show itself as a titled image.  Its `show` method can plot an image with a label in the title.

#### Type system

The type annotations are not just for type-checking. They also add semantics to tensors, informing what kind of thing a tensor represents (an *image*, for example).  The same piece of code can also act differently, and appropriately, on objects of different types.  For example, if something is of type `Image`, then, when displayed, it should show a picture of some sort, and because it is an image, it makes sense to apply a rotation transformation to it, etc.

In [None]:
def resized_image(fn:Path, sz=128):
    x = Image.open(fn).convert('RGB').resize((sz,sz))
    # Convert image to tensor for modeling
    return tensor(array(x)).permute(2,0,1).float()/255.

`fn:Path` means that the first positional argument of `resized_image` is called `fn`, and it is expected to be of type `Path`.  Since `resized_image` is a normal Python function, the annotation is not enforced: `fn` doesn't really need to be of type `Path`, and a `str` path works just as well.

In [None]:
type(resized_image(items[11]))

torch.Tensor

In [None]:
type(resized_image(str(items[11])))

torch.Tensor

#### `Transform`

Things of type `Transform` are *transforms*.  In general, they have an `encodes` method, which converts the data into a form that is closer to what a model can use.  So, between the raw data and something that can be directly input into a model, there are one or more transforms.  

Transforms tend to convert raw data into something that is less directly intelligible to us.  For example, we can look at an image and see what is in it, but not once it has been converted to a tensor, which is what a model requires.  Therefore, transforms can also have a `decodes` method, which converts data from a form that is not very intelligible to us back to one that is.  

Here is a simple example of a transform that converts a path of an image file to an image and its category label.  How it encodes is defined in the `encodes` method; how it decodes is defined in the `decodes` method.  After the transform is created, calling it calls the `encodes` method, while `.decode` calls the `decodes` method.  

In [None]:
class PetTfm(Transform):
    def __init__(self, items, train_idx):
        self.items,self.train_idx = items,train_idx
        self.labeller = RegexLabeller(pat = r'/([^/]+)_\d+\.jpg$')
        vals = map(self.labeller, items[train_idx])
        self.vocab,self.o2i = uniqueify(vals, sort=True, bidir=True) 

    def encodes(self, i):
        o = self.items[i]
        return resized_image(o), self.o2i[self.labeller(o)]
    
    def decodes(self, x): return TitledImage(x[0],self.vocab[x[1]])

In [None]:
pets = PetTfm(items, split_idx[0])

In [None]:
x,y = pets(0)
x.shape,y

In [None]:
dec = pets.decode((x,y))
dec.show()
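
The encode/decode round trip can also be seen in isolation, without Fastai or any image data.  The class below is a hypothetical stand-in, not a real `Transform`; it only mimics the convention that calling the object encodes and `.decode` decodes.

```python
class CategoryCodec:
    "Hypothetical stand-in for a transform with `encodes` and `decodes`."
    def __init__(self, labels):
        self.vocab = sorted(set(labels))
        self.o2i = {o: i for i, o in enumerate(self.vocab)}

    def encodes(self, o): return self.o2i[o]     # label -> integer for the model
    def decodes(self, i): return self.vocab[i]   # integer -> readable label

    def __call__(self, o): return self.encodes(o)  # calling the object encodes
    def decode(self, i):   return self.decodes(i)  # `.decode` calls `decodes`

codec = CategoryCodec(["pug", "Siamese", "pug", "Birman"])
code = codec("pug")
print(code, codec.decode(code))
```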

`Transform` enforces type annotations.  Here is an example.  The function `resized_image` can be converted into a transform, so that the transform's `encodes` method implements it.  Now, the input to the transform *really* needs to be of the type `Path`, otherwise the transform does nothing and just returns the input:

In [None]:
Transform(resized_image)

Transform: False (Path,object) -> resized_image 

In [None]:
type(Transform(resized_image)(items[2]))

torch.Tensor

In [None]:
Transform(resized_image)(str(items[2]))

'/Users/jack/.fastai/data/oxford-iiit-pet/images/basset_hound_112.jpg'

In this way, it is possible to define a transform that can handle different inputs of different types. 
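
One way such behaviour could be implemented is to read the annotation of the wrapped function's first parameter and check the input against it.  This is a sketch under that assumption, not how `Transform` is actually written; `TypedTransformSketch` and `file_stem` are made-up names.

```python
import inspect
from pathlib import Path

class TypedTransformSketch:
    "Apply `f` only when the input matches the annotation of its first parameter."
    def __init__(self, f):
        self.f = f
        first = next(iter(inspect.signature(f).parameters.values()))
        # No annotation means "accept anything".
        self.typ = object if first.annotation is inspect.Parameter.empty else first.annotation

    def __call__(self, x):
        return self.f(x) if isinstance(x, self.typ) else x

def file_stem(fn: Path): return fn.stem

tfm = TypedTransformSketch(file_stem)
print(tfm(Path("images/pug_52.jpg")))  # a Path: the function is applied
print(tfm("images/pug_52.jpg"))        # a str: returned unchanged
```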

#### `uniqueify`

The `uniqueify` function takes a list of values and turns it into a vocabulary: a list of the unique values in the given list.  It also has a `bidir` keyword argument, which, when set to `True`, also returns the reverse mapping, from object to index (the `o2i` below).

In [None]:
vocab,o2i = uniqueify(vals, sort=True, bidir=True)
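
A minimal stand-in that behaves in the way described, for experimenting outside the library.  `uniqueify_sketch` is a made-up name, and the details (such as keeping first-seen order when `sort=False`) are assumptions.

```python
def uniqueify_sketch(xs, sort=False, bidir=False):
    "Return the unique values of `xs`; with `bidir=True` also return a value-to-index dict."
    vocab = list(dict.fromkeys(xs))       # drop duplicates, keep first-seen order
    if sort: vocab.sort()
    if bidir: return vocab, {o: i for i, o in enumerate(vocab)}
    return vocab

vocab, o2i = uniqueify_sketch(["pug", "beagle", "pug", "boxer"], sort=True, bidir=True)
print(vocab)
print(o2i)
```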

#### `L`

This is Fastai's own sequence type that's meant to be analogous to Python's `list`.  

In [None]:
a = L(5, 6, 8)

In [None]:
a

(#3) [5,6,8]

In [None]:
a.map(operator.neg)

(#3) [-5,-6,-8]

In [None]:
a + 0, 5 + a

((#4) [5,6,8,0], (#4) [5,5,6,8])

In [None]:
L([5, 'h'], [9, 'j'], [100, 'k']).itemgot(0)

(#3) [5,9,100]

#### `Pipeline`

`Pipeline` can be used to compose a sequence of transforms, so they will be applied one after another.
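
The core idea, applying functions one after another, can be sketched with `functools.reduce`.  `compose_sketch` is a made-up helper that covers only the composition part; whatever else `Pipeline` does is not modelled here.

```python
from functools import reduce

def compose_sketch(*funcs):
    "Return a function that applies `funcs` from left to right."
    return lambda x: reduce(lambda acc, f: f(acc), funcs, x)

# Three small steps chained into one callable.
pipe = compose_sketch(str.strip, str.lower, lambda s: s.replace(" ", "_"))
print(pipe("  Basset Hound "))
```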

#### `TupleTransform`

`TupleTransform` is a special type of transform, which, when given a tuple to encode, will apply its `encodes` method to each element in the tuple.  This is convenient because things tend to be grouped in a tuple along a transform pipeline, so instead of having to apply a transform to each element inside a loop, just use `TupleTransform`.  Note that they have to be tuples, not lists, because PyTorch's dataloader only works with tuples. 

Note that the only difference between `Transform` and `TupleTransform` is that `TupleTransform` has its attribute `as_item_force` set to `False`.  This attribute specifies whether the transform type should apply its `encodes` method to the input as a whole, or to each element of the input.  

In [None]:
TupleTransform??
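
The difference between "apply to the input as a whole" and "apply to each element" is easy to see with a toy function.  `apply_transform` and its `as_item` flag are hypothetical, introduced only to illustrate the two modes.

```python
def apply_transform(f, x, as_item):
    "Apply `f` to `x` as a whole, or to each element when `x` is a tuple and `as_item` is False."
    if not as_item and isinstance(x, tuple):
        return tuple(f(o) for o in x)
    return f(x)

double = lambda o: o * 2
print(apply_transform(double, (1, 2), as_item=True))   # whole tuple is doubled
print(apply_transform(double, (1, 2), as_item=False))  # each element is doubled
```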

#### `DataSource`

`DataSource` applies a transform, or transform pipeline, to a list of items.  

In [None]:
class ImageResizer(Transform):
    "Resize image to `size` using `resample`"
    order=10
    def __init__(self, size, resample=Image.BILINEAR):
        if not is_listy(size): size=(size,size)
        self.size,self.resample = (size[1],size[0]),resample

    def encodes(self, o:PILImage): return o.resize(size=self.size, resample=self.resample)
    def encodes(self, o:PILMask):  return o.resize(size=self.size, resample=Image.NEAREST)

In [None]:
tfms = [[PILImage.create, ImageResizer(128), ToTensor(), IntToFloatTensor()],
        [labeller, Categorize()]]
dsrc = DataSource(items, tfms)

Its two main input arguments are `items`, a list of items, and `tfms`, a list of lists of transforms.  In general, `tfms` is of length 2, with the first element being the list of transforms for the independent variable, and the second element being the list of transforms for the dependent variable.
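
A toy version of the idea: each item is run through every list of transforms, giving one output per list.  `DataSourceSketch` is a made-up class using plain string functions in place of the image and label transforms, so it shows only the shape of the mechanism.

```python
class DataSourceSketch:
    "Index into `items` and run each item through every pipeline in `tfms`."
    def __init__(self, items, tfms):
        self.items, self.tfms = items, tfms

    def __len__(self): return len(self.items)

    def __getitem__(self, i):
        out = []
        for pipeline in self.tfms:          # one pipeline per output variable
            o = self.items[i]
            for f in pipeline: o = f(o)     # apply the transforms in order
            out.append(o)
        return tuple(out)

files = ["pug_52.jpg", "Siamese_193.jpg"]
breed = lambda fn: fn.rsplit("_", 1)[0]
tfms = [[str.upper],                 # stand-in for the image pipeline (x)
        [breed, str.lower]]          # stand-in for the label pipeline (y)
ds = DataSourceSketch(files, tfms)
print(ds[1])
```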

#### Type Dispatch

`ImageResizer` above, when given a `PILImage`, resamples using `self.resample`, which defaults to the `Image.BILINEAR` scheme.  When given a `PILMask`, it resamples using the `Image.NEAREST` scheme.  This is called *type dispatch*.

In general, most types in Fastai have a `create` class method that can be used to create an instance of that type.
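
Python's standard library offers a comparable mechanism in `functools.singledispatch`, which picks an implementation based on the type of the first argument.  The sketch below uses it purely as an analogy; it is not the mechanism Fastai uses, and `ImageSketch` and `MaskSketch` are made-up stand-ins for `PILImage` and `PILMask`.

```python
from functools import singledispatch

class ImageSketch(list): pass   # hypothetical stand-ins for PILImage / PILMask
class MaskSketch(list):  pass

@singledispatch
def resample_mode(o):
    raise TypeError(f"no rule for {type(o).__name__}")

@resample_mode.register
def _(o: ImageSketch): return "bilinear"   # smooth interpolation for pictures

@resample_mode.register
def _(o: MaskSketch): return "nearest"     # keep mask labels intact

print(resample_mode(ImageSketch()), resample_mode(MaskSketch()))
```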

# - fin