# 💻 Welcome to the DL201 Bootcamp

## A new chapter in your machine learning journey

Welcome to the DL201 Bootcamp! This is a new bootcamp designed with collaboration and inclusion in mind. We learn as a community, with guidance along the way, and cover essential data manipulation skills, important machine learning and AI concepts, and the use of existing software tools and pre-trained models. Our goal is to give students the skills to build a proof of concept of their desired project in a short time.


## 📕 Learning Objectives

* Acquire basic data manipulation skills that allow students to make use of machine learning and AI tools.
* Have an appreciation for important processing methods and transformations commonly required to load datasets into tensors.
* Build a foundation on using and fine-tuning pre-trained models.
* Apply obtained skills and knowledge to build a proof of concept of a project.

## Benefits of a bootcamp

* Discussing concepts with peers helps you understand them in depth rather than rely on memorized formulas.
* It increases your ability to analyse unfamiliar situations.
* Passive lectures offer little that online videos don't already provide.
* You can form relationships and a sense of community with classmates.

## 📅 Weekly Progress

<table>
<tbody>
<tr class="odd">
<td>Week</td>
<td>Title</td>
<td>Learning Content</td>
</tr>
<tr class="even">
<td>0</td>
<td>Course deliverables and project setup</td>
<td><ol type="1">
<li><p>Meet your mentors, unpackers</p></li>
<li><p>Understand the bootcamp objectives and logistics</p></li>
<li><p>Receive the framework for building your AI project in this
course. Learn and build as you go over the first three weeks, then
finalize the project by the end of the fourth week.</p></li>
</ol></td>
</tr>
<tr class="odd">
<td>1</td>
<td>Data Loading and Exploratory Analysis</td>
<td><ol type="1">
<li><p>Understand how to load datasets and metadata in the most common
file formats</p></li>
<li><p>Explore a dataset and judge whether it is suitable for our
problem</p></li>
<li><p>Gain tools to manipulate metadata and tabular data using
Pandas</p></li>
<li><p>Appreciate how data can be represented as tensors</p></li>
<li><p>Explore the data-centric approach in AI, and learn about its
importance in Machine &amp; Deep Learning</p></li>
<li><p>Use tools to label, improve and balance your
datasets</p></li>
</ol></td>
</tr>
<tr class="even">
<td>2</td>
<td>Data Preprocessing and Transformations</td>
<td><ol type="1">
<li><p>Common Data Wrangling tasks</p></li>
<li><p>Feature engineering</p></li>
<li><p>Grouping into Categories</p></li>
<li><p>Feature Decomposition</p></li>
<li><p>Tabular Data Transformation methods</p></li>
<li><p>Computer vision preprocessing techniques such as label encoding
and handling unbalanced classes</p></li>
<li><p>Computer vision data transformations such as normalizing pixel
values</p></li>
<li><p>Text preprocessing for NLP tasks, such as encoding and
embeddings</p></li>
</ol></td>
</tr>
<tr class="odd">
<td>3</td>
<td>Algorithms and Model Training</td>
<td><ol type="1">
<li><p>The most common cutting-edge DL algorithms and
architectures</p></li>
<li><p>Pretrained models</p></li>
<li><p>Hyperparameters</p></li>
<li><p>Fine-tuning of pretrained models</p></li>
</ol></td>
</tr>
<tr class="even">
<td>4</td>
<td>Project Finalization</td>
<td><ol type="1">
<li><p>Fully apply newly gained machine learning skills to your ongoing
project to deliver the final results.</p></li>
<li><p>Get on 1:1 calls with mentors to get personalized feedback and
recommendations before presenting it as a proof-of-concept project on
Demo Day.</p></li>
<li><p>Have your project featured on our social media, receive your
certificate, and get the opportunity to become a mentor for future
cohorts</p></li>
</ol></td>
</tr>
</tbody>
</table>

## Software Requirements for the Bootcamp

To ensure a good learning experience, you will need the software listed below. All of it is free, except for a VPN should you need one.

1. **Tencent Meeting**. To join live sessions with mentors and fellow students, install Tencent Meeting or VooV Meeting on your laptop.

> *We don't recommend using your phone to join live sessions: it doesn't provide a proper online session environment and will compromise the quality of live group interactions.*

2. **WeChat**. It's our main tool for everyday communication. In the WeChat group you will receive instructions and announcements, ask any questions about the content or the Bootcamp, and receive answers in a timely manner.

3. **Kaggle Notebooks**. Kaggle is a platform for data science competitions, and its notebooks are a great tool for writing and running code in the browser. You can use them to submit your code and get feedback from other participants.

## Getting to know each other
Let's briefly talk about:

- Your expectations for the Bootcamp.
- Any previous experience with AI.
- Your coding background.
- Your desired AI project.

## This bootcamp is hosted on GitHub!

GitHub is a ubiquitous tool in software development. It is full of features that allow hundreds or even thousands of people to collaborate online. It is also a social network that suits this course well. It has a learning curve, but it pays off in both the short and the long term.

https://github.com/unpackAI/DL201/

Let's review how to:
- Clone the repository.
- Use the Issues section to report problems and to ask for or provide help.
- Use the Discussions section.

## We recommend the usage of Kaggle Notebooks (live mini-demo)
Google Colab and local environments are still supported, but Kaggle is our recommended option. Kaggle is a well-known portal for data science competitions and it has an enormous collection of datasets.

https://www.kaggle.com/code

Important reasons to use Kaggle Notebooks:
- No VPN access required.
- Dozens of hours of free GPU usage every week (Kaggle sets the exact quota and may change it).
- Familiar environment with Jupyter.
- No need to install any software.
- Immediate access to the enormous collection of datasets provided by Kaggle.

## Sample code: collect images from the PETS dataset to train a ResNet-based dogs vs cats classifier


In [None]:
# The purpose of this sample code is to test your Python environment.
# If this cell shows a pip error on Kaggle, simply try again or reload the notebook.
# Make sure you enable the GPU accelerator from the top right corner of the notebook.

# Install packages (comment out if not required)
!pip install -Uqq ipywidgets fastai fastbook

# Import dependencies for all sample AI applications (again, to test the environment)
import os
import numpy
import pandas
import torch
from fastai.vision.all import *
from fastai.text.all import *
from fastai.collab import *
from fastai.tabular.all import *
import ipywidgets as widgets
from IPython.display import Image
import fastbook
import urllib.request
import requests

fastbook.setup_book()
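
The setup cell asks you to enable the GPU accelerator. Here is a minimal sketch for checking whether that worked, assuming PyTorch is the backend (fastai is built on it). The helper name `describe_accelerator` is our own and is not part of any library:

```python
import importlib.util


def describe_accelerator():
    """Return a short message describing the available compute device."""
    # Avoid an ImportError if PyTorch is missing from the environment.
    if importlib.util.find_spec("torch") is None:
        return "PyTorch is not installed"

    import torch

    if torch.cuda.is_available():
        # Index 0 is the first (and on Kaggle usually the only) GPU.
        return f"GPU available: {torch.cuda.get_device_name(0)}"
    return "No GPU detected; running on CPU"


print(describe_accelerator())
```

If the message reports no GPU, switch the accelerator on in the notebook settings and rerun the cell. Training on CPU works but is much slower.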

### AI application sample: collect images from the PETS dataset to train a ResNet-based dogs vs cats classifier

In [None]:
"""
AI application sample: collect images from the PETS dataset to train a
ResNet-based dogs vs cats classifier
"""

# Download the images, move into the image folder and list its files
image_path = untar_data(URLs.PETS)/'images'
os.chdir(image_path)
# Keep only JPEG files, in case the folder also contains non-image files
filenames = [name for name in os.listdir('.') if name.endswith('.jpg')]

def slider_callback(position):
    image_object = Image(filename=filenames[position], width=600)
    display(image_object)

# The slider's max is inclusive, so the last valid index is len(filenames) - 1
widgets.interact(slider_callback, position=widgets.IntSlider(min=0, max=len(filenames) - 1, step=1))

In [None]:
# In this dataset, the filenames of cat images begin with an uppercase letter
print(filenames[:11])

# Define a function that uses that property to decide whether a file is a cat image
def is_cat(filename):
    return filename[0].isupper()

# Create a dataloader, holding out 20% of the images for validation
data_loader = ImageDataLoaders.from_name_func(
    path=image_path,
    fnames=get_image_files(image_path),
    label_func=is_cat,
    valid_pct=0.2,
    seed=42,
    item_tfms=Resize(224),
)
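
Before training, it can help to check that the labeling rule behaves as expected. The sketch below applies the same `is_cat` rule to a handful of hypothetical filenames that follow the dataset's naming pattern. They are made up for illustration and are not read from disk:

```python
from collections import Counter


def is_cat(filename):
    # Same rule as in the cell above: cat filenames start with an uppercase letter.
    return filename[0].isupper()


# Hypothetical filenames for illustration only.
sample_filenames = [
    "Bengal_12.jpg",
    "Persian_7.jpg",
    "beagle_3.jpg",
    "pug_41.jpg",
    "Sphynx_100.jpg",
]

# Count how many files fall into each class (True = cat, False = dog).
label_counts = Counter(is_cat(name) for name in sample_filenames)
print(label_counts)  # Counter({True: 3, False: 2})
```

Running the same count over the real file list gives a quick view of how balanced the two classes are.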

In [None]:
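# Feed the data to the model and fine-tune it for 2 epochs.
# Note: more epochs are usually required to achieve a good result, but given
# the quality of the dataset and of the pretrained model, this is enough here.
# fine_tune(2) first trains the new head for one epoch with the pretrained
# layers frozen, then trains all layers for 2 epochs.
# vision_learner is the current name of cnn_learner in recent fastai releases.
image_learner = vision_learner(data_loader, resnet34, metrics=error_rate)
image_learner.fine_tune(2)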

In [None]:
# Display a sample image for inference (a cat)
image_url = "https://i.postimg.cc/02Tv8pdc/sample-cat.jpg"
image_filename = "sample-cat.jpg"
Image(url=image_url, width=500, height=500)

In [None]:
# Download the image using its URL and filename
# (this is one of several ways to download an image)
urllib.request.urlretrieve(image_url, f"{os.getcwd()}/{image_filename}")
image_data = PILImage.create(image_filename)

In [None]:
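# Print inference results
# predict returns the decoded label, its index in the vocabulary, and the per-class probabilities
prediction_label, label_index, probabilities = image_learner.predict(image_data)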
print(f"Is this a cat?: {prediction_label}.")
print(f"Probability it's a cat: {probabilities[1].item():.6f}")
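
`predict` returns the decoded label, the index of that label, and one probability per class. With a True/False label function the classes are ordered `[False, True]`, which is why index 1 holds the cat probability. Here is a plain-Python sketch of that lookup, using made-up probabilities and a hypothetical helper `summarize_prediction` (neither comes from fastai):

```python
def summarize_prediction(vocab, probabilities):
    """Pair each class with its probability and pick the most likely class."""
    by_class = dict(zip(vocab, probabilities))
    best_class = max(by_class, key=by_class.get)
    return best_class, by_class


# Assumed class order for a True/False label function.
vocab = [False, True]
# Made-up probabilities for illustration, not real model output.
probabilities = [0.02, 0.98]

best_class, by_class = summarize_prediction(vocab, probabilities)
print(f"Is this a cat?: {best_class}")
print(f"Probability it's a cat: {by_class[True]:.2f}")
```

In the notebook, the actual class order can be read from `image_learner.dls.vocab`.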

In [None]:
# Now it's your turn: upload an image or use a URL for inference, using the sample code above as a reference.
eval_image_url = "https://pulpbits.net/wp-content/uploads/2014/01/Calico-Siberian-Cat-1024x776.jpg" # Replace me
eval_image_filename = "sample_image.jpg"

# Download the image and stop early if the server reports an error
response = requests.get(eval_image_url, timeout=30)
response.raise_for_status()
with open(eval_image_filename, 'wb') as handler:
    handler.write(response.content)

image_data = PILImage.create(eval_image_filename)
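
If the URL you paste points to a web page rather than an image file, the download can still succeed while the saved file is not an image that can be opened. Here is a small sketch of a sanity check on the downloaded bytes, assuming you only expect JPEG or PNG files. `detect_image_format` is a hypothetical helper, not part of `requests` or fastai:

```python
# File signatures ("magic bytes") at the start of two common image formats.
IMAGE_SIGNATURES = {
    "jpeg": b"\xff\xd8\xff",
    "png": b"\x89PNG\r\n\x1a\n",
}


def detect_image_format(content):
    """Return the detected image format name, or None if it is not recognized."""
    for format_name, signature in IMAGE_SIGNATURES.items():
        if content.startswith(signature):
            return format_name
    return None


# Made-up byte strings standing in for downloaded content.
print(detect_image_format(b"\xff\xd8\xff\xe0" + b"\x00" * 8))  # jpeg
print(detect_image_format(b"<!DOCTYPE html><html>"))  # None
```

Calling the helper on the downloaded bytes before writing them to disk makes a wrong URL easier to spot.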

In [None]:
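# Print inference results
# predict returns the decoded label, its index in the vocabulary, and the per-class probabilities
prediction_label, label_index, probabilities = image_learner.predict(image_data)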
print(f"Is this a cat?: {prediction_label}.")
print(f"Probability it's a cat: {probabilities[1].item():.6f}")

## Resources
Provide a couple of links to articles or videos that you find helpful. As Week 1 has not begun yet, they could be about general AI concepts or about the specific topics of the bootcamp (Python, Pandas, etc.).

- https://www.techrepublic.com/article/why-85-of-ai-projects-fail/
- https://spotify.design/article/three-principles-for-designing-ml-powered-products
- Take a look at the AI Canvas located at: https://www.predictionmachines.ai/

### Discussion / questions:

1. Do you need these for deep learning?
- Lots of math?
- Lots of data?
- Lots of expensive computers?
- A PhD?

2. If a human can see a pattern in an image, should a CV model be able to detect it?
- Yes
- No