# Intro to Colab


Colab (or "Colaboratory") lets you write and run Python code in your browser, with:
- No configuration required
- Free access to GPUs
- Easy sharing

Here, for example, is a code cell with a short Python script that calculates a value, stores it in a variable, and prints the result:

In [None]:
seconds_in_a_day = 24 * 60 * 60
# seconds_in_a_day
print(seconds_in_a_day)

To run the code in the cell above, select it by clicking on it, then click the play button to the left of the code or use the keyboard shortcut Command/Ctrl+Enter or Shift+Enter. To edit the code, just click on the cell.

Variables you define in one cell can be used later in other cells:

In [None]:
seconds_in_a_week = 7 * seconds_in_a_day
seconds_in_a_week

Colab notebooks let you combine executable code, rich text, images, HTML, LaTeX and more in a single document.  
When you create Colab notebooks, they are saved to your Google Drive account. You can easily share them with collaborators or friends, who can then add comments or even edit them. To learn more, see the <a href="/notebooks/basic_features_overview.ipynb">Colab overview</a> page.

Colab notebooks are Jupyter notebooks hosted by Colab. To learn more about the Jupyter project, see the <a href="https://www.jupyter.org">jupyter.org</a> website.

# Miscellaneous
> A glimpse into data science



Colab lets you take full advantage of popular Python libraries to analyze and visualize data. The code cell below uses numpy to generate random data and matplotlib to visualize it.

Importing a library in Colab is straightforward: all you have to do is `import library_name`.

**NOTE**: If a library is not pre-installed in Colab, you can always install it with `!pip install library_name`.
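
Before reaching for `!pip install`, it can help to check whether a library is already available in the runtime. The sketch below uses the standard-library `importlib.util.find_spec`; the helper name `is_installed` and the second package name are made up for illustration and are not part of Colab.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be found in this runtime."""
    return importlib.util.find_spec(module_name) is not None


# "json" ships with Python; the second name is a hypothetical placeholder.
for name in ["json", "surely_not_a_real_package_xyz"]:
    if is_installed(name):
        print(f"{name}: available")
    else:
        print(f"{name}: missing, try `!pip install {name}` in a cell")
```

Note that the name used on PyPI is not always the import name, so treat this as a quick check rather than a guarantee.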

In [None]:
import numpy as np
from matplotlib import pyplot as plt

# 100 noisy samples centered on 200
ys = 200 + np.random.randn(100)
x = np.arange(len(ys))  # x-axis positions 0..99

plt.plot(x, ys, '-')
plt.fill_between(x, ys, 195, facecolor='g', alpha=0.6)

plt.title("Sample Visualization")
plt.show()

Here is a glimpse of why we should prefer ```numpy``` over plain Python for numerical work.  

Let's see this by calculating the **mean** the _traditional Python way vs. the numpy way_.

In [None]:
# Define a numpy array of 100 random elements
ys = 200 + np.random.randn(100)
len(ys)

In [None]:
# Time the mean calculation using numpy
import time
start_time = time.perf_counter()  # high-resolution timer
ys_mean = np.mean(ys)
elapsed = time.perf_counter() - start_time
print(elapsed)

In [None]:
# Time the mean calculation using plain Python
start_time_py = time.perf_counter()  # high-resolution timer
ys_mean_py = sum(ys) / len(ys)
elapsed_py = time.perf_counter() - start_time_py
print(elapsed_py)
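
A single wall-clock measurement of such a tiny operation is mostly noise. As a rough sketch of a fairer comparison, the standard-library `timeit` module can repeat each calculation many times. The array size of 100,000 and the repeat count of 100 are assumptions chosen here so the gap is measurable; they are not values from this notebook. The data is also converted to a plain Python list so that the pure-Python version does not iterate over a numpy array.

```python
import timeit

import numpy as np

# Assumption: a larger array than the 100 elements used above.
rng = np.random.default_rng(0)
ys = 200 + rng.standard_normal(100_000)
ys_list = ys.tolist()  # plain Python floats for the pure-Python version

# Run each calculation 100 times and record the total elapsed seconds.
numpy_seconds = timeit.timeit(lambda: np.mean(ys), number=100)
python_seconds = timeit.timeit(lambda: sum(ys_list) / len(ys_list), number=100)

print(f"numpy:  {numpy_seconds:.4f} s for 100 runs")
print(f"python: {python_seconds:.4f} s for 100 runs")
```

On very small inputs, such as the 100-element array above, function-call overhead can dominate and the two timings may be close; the difference usually becomes clearer as the data grows.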

Jupyter/Colab is also ```bash friendly```: prefix a command with `!` to run it in the shell from within the notebook.

In [None]:
!pwd

In [None]:
!ls -la

In [None]:
!ls -la | grep something
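
The `!` prefix is a notebook feature, so it does not work in a plain `.py` script. A rough equivalent in ordinary Python, assuming a Unix-like runtime where `ls` exists (as on Colab), is the standard-library `subprocess` module. The filter pattern `"."` is only a sample.

```python
import subprocess

# Roughly what `!ls -la` does, but capturing the output as a string.
result = subprocess.run(["ls", "-la"], capture_output=True, text=True, check=True)

# Filter lines in Python instead of piping to grep; "." is just a sample pattern.
matching = [line for line in result.stdout.splitlines() if "." in line]
print("\n".join(matching))
```

Passing the command as a list avoids shell quoting issues; `check=True` raises an exception if the command exits with an error.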

# Overview of the session

### Why this training?

This training helps engineers better understand the data cycle and the different components and roles involved in it:
- Data Engineers
- Data Analysts
- Machine Learning Engineers
- MLOps Engineers (ML + DevOps)

Diagram of the Data cycle

![image](https://drive.google.com/uc?id=1oY1UwyrbmPOEV7dtKzEYDUFfeUYv5BAK)

Why is data important in today's era?  

We live in an information era in which vast amounts of data are produced every second. Some of the biggest companies have gone fully ***data-driven*** to better understand their customers and their needs.  
Examples:
- YouTube's recommendation system
- Predictive smartphone keyboards such as SwiftKey (learning your keystroke patterns)

Data helps a company understand its customers better than a traditional system can.  
```Data Science helps you be data-driven```

True data science is a collaboration of CS, math, and business (see the Venn diagram below), in which we convert **Noisy Data → Insights & Knowledge → Actions**.  

![DataSci](https://drive.google.com/uc?id=1cCyY6hU7xkWj5gGa0G8JONGmpyGGBEd5)

The purpose of data science is to tell you a story and help you visualize it:  
- Getting more insights from the data that would not otherwise be obvious
- Making faster decisions
- Eliminating human intervention where possible

Today we'll look at how to use ML to train a classifier that distinguishes good wines from bad ones.

**Summary**: Wine tasting is a real task that humans performed for a very long time before AI was applied to it. Wine quality prediction is difficult: even the experts (so-called **sommeliers**) reach an accuracy of only around ```71%```.  


Before we jump much deeper into machine learning, let's look at the ```terminology around AI & Machine Learning``` and why the two terms are sometimes used interchangeably.

![image AI vs ML](https://drive.google.com/uc?id=1kblaDfOVmgfTQnU34iqO3cqQE75CX0Lc)

*When you’re fundraising, it’s AI*   
*When you’re hiring, it’s ML*  
-- Baron Schwartz 