# "Fastai Chapter 2"
> "Want to find Modi beard from 2014 or 2019"

- toc: false
- branch: master
- badges: true
- comments: true
- categories: [fastpages, jupyter]
- image: images/some_folder/your_image.png
- hide: false
- search_exclude: true

## What is Fastai 

Fastai is a front-running research organisation working to make deep learning as accessible as possible, backed by a wonderfully supportive community. In a way, fastai shortens the path to becoming a capable ML practitioner, and it honestly lives up to the title of its popular book: [__Deep Learning for Coders with fastai & PyTorch (AI Applications Without a PhD)__](https://www.amazon.in/Deep-Learning-Coders-fastai-PyTorch/dp/1492045527/ref=sr_1_1?dchild=1&keywords=jeremy+howard&qid=1627333514&sr=8-1). I sincerely want to thank __Mr Jeremy Howard and Mr Sylvain Gugger__ for presenting such a gem in just 600 pages. The series of posts I am going to write on this book is a way to solidify my understanding of the concepts it teaches and to practise writing, which the authors and the community strongly encourage. I hope it turns out to be a good reference for me and for beginners who are just behind me.

## What tools do you need

Mind you, we are just beginners, and we don't need anything more than a [Google Colab scratchpad](https://colab.research.google.com/notebooks/empty.ipynb). It spins up a notebook with a free GPU (Graphics Processing Unit) with just a click of the mouse. Great Colab tips can be found [here](https://amitness.com/2020/06/google-colaboratory-tips/). Full server setups such as GCP and AWS instances are still difficult to configure for complete beginners like me. Believe me, I spent a whole month figuring out how to set up a GCP instance. Now I almost always spin up Colab for course work because of its ease and simplicity. Even [Zachary Mueller](https://github.com/muellerzr), who is something of a poster child for the fastai community, prefers to practise with Colab.

## Domains of Deep Learning

Deep learning can be applied to a wide variety of tasks, and a tinkering mindset helps one apply it in creative ways. The increasing availability of data in almost every field opens up new use cases and new insights. However, current research and practice broadly fall under the following categories.

### Computer Vision:

What do you think Tesla's FSD (Full Self-Driving) cars rely on for their Autopilot system? Computer vision is the bedrock of autonomous driving. Today, computers using neural networks are as good as people at identifying objects in an image. We take up computer vision in this chapter and slowly expand to other domains as the course progresses. The predictions of any deep learning system depend on the quality of the training data we provide. The book provides tools and strategies for reaching state-of-the-art accuracy with innovative methods, even with limited data.

### Natural Language Processing (NLP):
 
Have you ever heard of GPT-3, GPT-2, Transformers, Hugging Face, Twitter bots, or Google Translate? All of these belong to the text domain of deep learning. This domain deals with how computers handle text data and respond to a specified task. Today, NLP tasks range from classifying documents and sentiment analysis (positive, negative) to summarising documents and generating context-appropriate text. The downsides of the technology range from bots hurling abuse to bot-generated Twitter trolls at massive scale. On the other hand, Google Translate is used heavily in today's increasingly multicultural society.

### Tabular Data & RecSys:

The most common ML applications in industry are based on tabular data. Though traditional ML techniques such as random forests and gradient boosting still hold up well, deep learning has lately been making strides in the tabular domain. A RecSys (recommendation system) also works on a kind of tabular data, but with high-cardinality columns, and deep learning can be applied to such data with good results. E-commerce websites like Amazon apply deep learning techniques to their user tables to recommend products.

## Thinking Approach

When solving any problem with ML and deep learning techniques, the fundamental principle should be that the model's predictions are useful in the real world. Jeremy Howard named this the __Drivetrain Approach__, and his elegant presentation can be seen [here](https://www.youtube.com/watch?v=vYrWTDxoeGg&ab_channel=O%27Reilly). Its key ideas are as follows:
1. Objective: What outcome am I trying to achieve?
2. Levers: What inputs can we control?
3. Data: What data can we collect?
4. Models: How do the levers influence the objective?

Though seemingly a simple concept, its application in the real world has a great impact on the outcome of the modelling. In general, the data we have on hand may not be causal, and the outcomes it records may not represent actual conditions. If we can control certain inputs by injecting randomness, we can collect data that reveals causal relationships. That new data becomes food for the model, and the accuracy of its predictions can improve.
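
To make the point about randomness a bit more concrete, here is a minimal simulation sketch. Everything in it is an assumption made up for illustration: the lever is a hypothetical "discount" shown to a customer, the objective is a purchase, and the probabilities are invented. In the observational setting the discount goes mostly to loyal customers, so the naive comparison mixes the effect of the lever with the effect of loyalty. In the randomized setting the discount is assigned by a coin flip.

```python
import random

TRUE_EFFECT = 0.10  # assumed uplift in purchase probability caused by the lever


def estimate_effect(n, randomize, seed=0):
    """Naive estimate: purchase rate with the discount minus the rate without it."""
    rng = random.Random(seed)
    seen = {True: 0, False: 0}
    bought = {True: 0, False: 0}
    for _ in range(n):
        loyal = rng.random() < 0.5  # hidden trait that also drives purchases
        if randomize:
            discount = rng.random() < 0.5  # lever set by a coin flip
        else:
            # historical policy: discounts mostly go to loyal customers
            discount = rng.random() < (0.9 if loyal else 0.1)
        p_buy = (0.5 if loyal else 0.1) + (TRUE_EFFECT if discount else 0.0)
        seen[discount] += 1
        bought[discount] += rng.random() < p_buy
    return bought[True] / seen[True] - bought[False] / seen[False]


observational = estimate_effect(20_000, randomize=False)
randomized = estimate_effect(20_000, randomize=True)
print(f"observational estimate: {observational:.2f}")
print(f"randomized estimate:    {randomized:.2f}")
```

Under these made-up numbers the observational estimate lands far above the true uplift, while the randomized one stays close to it, which is the reason for injecting randomness into the levers before trusting the data.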

## Let's Dive In

To make your first dive into deep learning a little more interesting, I have chosen a "Beard Modi" project. Interesting? Yes! You upload any image of the present Indian PM, Narendra Modi, and my app will tell whether the image shows "2014 Modi" or "2021 Modi". We do this with computer vision deep learning techniques. This notebook details each and every step of the process.

### Gathering Data
Though "Bing Image Search" is the primary source used in the book, the course.fast.ai website has lately offered DuckDuckGo as an alternative, since it doesn't need an API key. Please read the docs for more information on [this](https://course.fast.ai/images#DuckDuckGo).


In [1]:
#hide
!pip install fastbook -Uqq
import fastbook
fastbook.setup_book()

In [5]:
from fastbook import *

In [14]:
#hide
def search_images_ddg(term, max_images=200):
    "Search for `term` with DuckDuckGo and return unique URLs of about `max_images` images"
    assert max_images<1000
    url = 'https://duckduckgo.com/'
    # The search page embeds a `vqd` token that is passed on to the image endpoint
    res = urlread(url,data={'q':term})
    searchObj = re.search(r'vqd=([\d-]+)\&', res)
    assert searchObj
    requestUrl = url + 'i.js'
    params = dict(l='us-en', o='json', q=term, vqd=searchObj.group(1), f=',,,', p='1', v7exp='a')
    # A set drops duplicate URLs; 'next' points at the following page of results
    urls,data = set(),{'next':1}
    while len(urls)<max_images and 'next' in data:
        try:
            data = urljson(requestUrl,data=params)
            urls.update(L(data['results']).itemgot('image'))
            # The last page has no 'next' key, so only follow it when present
            if 'next' in data: requestUrl = url + data['next']
        except (URLError,HTTPError): pass
        time.sleep(0.2)  # brief pause between page requests
    return L(urls)
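
The least obvious step in the function above is the `vqd` token lookup, so here is a small standalone sketch of just that regular expression. The HTML fragment and the helper name `extract_vqd` are hypothetical and exist only for illustration. The real page is much longer and its layout may change.

```python
import re

# Hypothetical fragment of a search page; the token value is made up.
sample_html = "nrj('/d.js?q=grizzly%20bear&vqd=3-1234567890-987654321&kl=us-en')"


def extract_vqd(html):
    """Return the digits-and-hyphens token that follows 'vqd=', or None if absent."""
    match = re.search(r'vqd=([\d-]+)\&', html)
    return match.group(1) if match else None


print(extract_vqd(sample_html))
```

Returning `None` instead of asserting makes it easier to see when the page no longer contains a token in this form.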

## Working with Bears First
First we will train our first ML model to recognise images of bears, and then we will go ahead with the "Modi Beard" experiment. This ensures that I am on the right track and have not messed anything up.

#### Downloading Images with the DuckDuckGo API

In [17]:
bear_types='grizzly','black','teddy'
path=Path('bears')
if not path.exists():
  path.mkdir()
  for x in bear_types:
    dest=(path/x)
    dest.mkdir(exist_ok=True)
    urls=search_images_ddg(f'{x} bear',max_images=150)
    download_images(dest,urls=urls)
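
The cell above stores each bear type in its own folder, so the folder name can later serve as the label for every image inside it. The sketch below imitates that layout with empty placeholder files in a temporary directory, using only the standard library. The helpers `label_from_path` and `count_per_label` are hypothetical names for illustration and are not part of fastai.

```python
import tempfile
from collections import Counter
from pathlib import Path


def label_from_path(image_path):
    """Use the name of the parent folder as the class label."""
    return Path(image_path).parent.name


def count_per_label(root):
    """Count the .jpg files found under each label folder of `root`."""
    return Counter(label_from_path(p) for p in Path(root).rglob('*.jpg'))


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / 'bears'
    for bear in ('grizzly', 'black', 'teddy'):
        (root / bear).mkdir(parents=True)
        for i in range(3):
            (root / bear / f'{i}.jpg').touch()  # empty stand-in for a download
    counts = count_per_label(root)

print(counts)
```

A quick count like this is a cheap way to confirm that every class actually received images before training.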


In [19]:
urls

(#194) ['https://www.thetoyshoppe.com/images/productslarge/55100c.jpg','http://www.carolwrightgifts.com/gund-teddy-bear_11218_zoom0.jpg','https://image.made-in-china.com/2f0j00NdUtRvuYOLbp/2017-New-Design-Scary-Teddy-Bear-Zombie-Bear-Undead-Bear-Bloody-Bear-Horrible-Teddy-Bear.jpg','https://cdn0.rubylane.com/shops/700632/I-1277.1L.jpg','https://pngimg.com/uploads/teddy_bear/teddy_bear_PNG90.png','https://i2-prod.birminghammail.co.uk/incoming/article17583074.ece/ALTERNATES/s1227b/0_IFP_BEM_160120bears011JPG.jpg','https://d3n8a8pro7vhmx.cloudfront.net/accessalliance/pages/454/meta_images/original/Cute_but_Lonely_Teddy_Bear_Sitting_on_Grass.jpg?1591840883','https://www.onlinetoys.com.au/23673/oscar-teddy-bear-27cm-beige.jpg','http://www.alux.com/wp-content/uploads/2013/12/5.png','http://mylitter.com/wp-content/uploads/2019/02/bear.jpg'...]