# MeetUp 185 - Beginners' Python and Machine Learning - 21 Jun 2023 - Introduction to Colab

Learning objectives:
- python in colab
- markdown in colab
- command line from colab
- third party libraries in colab
- file system in colab

Links:
- Colab:   https://colab.research.google.com/drive/1Rb_h5koOeepO4iDVUEpFHxtAr0Z9327v
- Youtube: https://youtu.be/aeugWw4Okw4
- Meetup:  https://www.meetup.com/beginners-python-machine-learning/events/294077914/
- Github:  https://github.com/timcu/bpaml-sessions/tree/master/online

@author D Tim Cummings

Challenge 1: Install Software
- Use a Google account to go to https://colab.research.google.com
  - Doesn't need Python installed on your computer
  - Great for interactive use and data science
  - Not so good for scripting and building apps



# Markdown
Markdown cells accept raw HTML as well as Markdown syntax, so either can be used for formatting. See https://www.markdownguide.org/

<h3>Using html</h3>
<table><tr><th>One</th><th>Two</th></tr><tr><td>Three</td><td>Four</td></tr></table>

<hr>

### Using markdown specifics to create a table

|Five|Six|
|---|---|
|Seven|Eight|

---

- Create Python cells using the `+ Code` button
- Create Markdown cells using the `+ Text` button

# Latex

Google Colab Markdown also renders LaTeX written between `$` signs

$\sigma(z_i) = \frac{e^{z_{i}}}{\sum_{j=1}^K e^{z_{j}}} \quad \text{for } i=1,2,\dots,K$

https://ashki23.github.io/markdown-latex.html
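
The softmax formula above can also be evaluated in a code cell. The following is a minimal sketch in plain Python; the function name `softmax` and the sample scores are illustrative choices, not part of the session notebook. Subtracting the largest score before exponentiating is a common numerical-stability step that leaves the result unchanged.

```python
import math


def softmax(z):
    """Return softmax probabilities for a list of scores z (illustrative helper)."""
    largest = max(z)  # subtract the max so exp() cannot overflow
    exps = [math.exp(value - largest) for value in z]
    total = sum(exps)
    return [e / total for e in exps]


probabilities = softmax([1.0, 2.0, 3.0])  # sample scores, assumed for illustration
print(probabilities)
print(sum(probabilities))  # should be very close to 1
```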

<h3>Images</h3>
<p>Can use html for images</p>
<p align="center">
<img src="https://www.python.org/static/img/python-logo.png" alt="Python" title="python">
</p>

# Using Google Colab / Jupyter Notebooks / IPython

- type into a cell
- press `<shift><enter>` to execute the cell
- cells can be python code or markdown text
- use ? or Help menu for help
- see Tools menu > Keyboard shortcuts

In [None]:
# comments in code cells start with a '#'
# How to store a value in a variable
a = 5
print(a)

In [None]:
# values are remembered between cells
# IPython will automatically display the result of the last line
a + 6

In [None]:
# try executing ? to bring up IPython help
# If you are using Python Console this is not available
?

GETTING HELP
------------

Within IPython you have various ways to access help:

  - `?`         -> Introduction and overview of IPython's features (this screen).
  - `object?`   -> Details about 'object'.
  - `object??`  -> More detailed, verbose information about 'object'.
  - `%quickref` -> Quick reference of all IPython specific syntax and magics.
  - `help`      -> Access Python's own help system.

In [None]:
# Task 1 - get help on `print`

In [None]:
# Solution 1
print?
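
Outside IPython the `?` syntax is not available, but similar information can be reached with standard Python. A small sketch using the standard library `inspect` module (the variable name `doc` is just for illustration):

```python
import inspect

# inspect.getdoc returns the cleaned-up docstring of an object;
# the docstring is part of what `print?` displays in IPython
doc = inspect.getdoc(print)
print(doc)

# help(print) would show similar text through Python's own help system
```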

In [None]:
# Task 2 - Search the IPython quick reference

In [None]:
# Solution 2
%quickref

In [None]:
# Task 3 - Search the help on IPython's magic commands to find out how to time a whole cell

In [None]:
# Solution 3
%magic

In [None]:
# Task 3 (continued) - Time how long Python takes to calculate the 100th power of x when x is 2 to the power 100

In [None]:
%%timeit x=2**100
x**100
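
The `%%timeit` cell magic only works in IPython. As a rough equivalent for plain Python scripts, the standard library `timeit` module can time the same expression. This is a sketch only; the number of runs is an arbitrary choice and the measured time will vary between machines.

```python
import timeit

x = 2**100      # setup, computed once and not timed
runs = 10_000   # number of repetitions, chosen arbitrarily for this sketch

# globals= makes x visible to the timed statement
total_seconds = timeit.timeit("x**100", globals={"x": x}, number=runs)
print(f"{total_seconds / runs * 1e9:.0f} ns per evaluation (average over {runs} runs)")
```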

In [None]:
# Task 4 - From the quickref card work out how to run a shell command from a cell,
# then use the command `cat /etc/os-release` to identify the underlying operating system

In [None]:
# Solution 4
!cat /etc/os-release
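
Some of the same information is available from Python itself without a shell command. A minimal sketch using the standard library `platform` module (it reports the operating system and kernel rather than the contents of `/etc/os-release`):

```python
import platform

system_name = platform.system()   # e.g. 'Linux' on Colab
release = platform.release()      # kernel release string
summary = platform.platform()     # one-line description of the platform
print(system_name, release)
print(summary)
```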

In [None]:
# Find what is the default python version
!python3 --version

In [None]:
# In code you can find the python version using the sys library
import sys
sys.version

In [None]:
# How much memory have you been allocated
!free -m

In [None]:
# How much storage space
!df -h
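
Storage space can also be checked from Python. A small sketch using `shutil.disk_usage` from the standard library; it reports on a single path (the root folder here, an assumption for this example) rather than on every mounted file system as `df -h` does.

```python
import shutil

usage = shutil.disk_usage("/")  # named tuple: total, used, free (in bytes)
gib = 1024**3
print(
    f"total {usage.total / gib:.1f} GiB, "
    f"used {usage.used / gib:.1f} GiB, "
    f"free {usage.free / gib:.1f} GiB"
)
```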

In [None]:
# What shell is offered
!echo $SHELL

In [None]:
# Which user is running shell
!whoami

In [None]:
# How to upgrade Ubuntu with the latest security patches
# (-y accepts the confirmation prompt so the cell can run unattended)
!apt update
!apt upgrade -y

In [None]:
# Which version of pip
!pip --version

In [None]:
# Find out what third party packages are already installed
!pip list
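
The list of installed packages can also be read from inside Python. A minimal sketch using the standard library `importlib.metadata`; the dictionary name `installed` and the limit of ten printed rows are illustrative choices.

```python
from importlib import metadata

# Map each installed distribution name to its version
installed = {}
for dist in metadata.distributions():
    name = dist.metadata["Name"]
    if name:  # skip entries with damaged metadata
        installed[name] = dist.version

# Print the first ten only, to keep the output short
for name in sorted(installed, key=str.lower)[:10]:
    print(name, installed[name])
```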

In [None]:
# Upgrade third party library Pillow to the latest version from pypi.org.
# Restart the runtime, then rerun the `pip list` cell to confirm the new version is in use
!pip install --upgrade Pillow

In [None]:
# Install new package xlsxwriter from pypi.
!pip install xlsxwriter

In [None]:
# Install package from source https://detectron2.readthedocs.io/en/latest/tutorials/install.html
!pip install 'git+https://github.com/facebookresearch/detectron2.git'

In [None]:
# If a library can't be installed using pip then you can use ubuntu package manager apt
# Here is an example with a package which could be installed either way
# These days `pip install` is the preferred way of installing
# https://github.com/briancurtin/deprecation
!sudo apt install -y python3-deprecation

In [None]:
# Check with pip that it was installed
!pip list | grep deprecation

In [None]:
# Check in Python that it is now importable
import deprecation
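
A failed `import` raises an exception, which stops the cell. To test for a module without importing it, `importlib.util.find_spec` can be used. A short sketch, where `is_importable` is a hypothetical helper name:

```python
import importlib.util


def is_importable(module_name):
    """Return True if Python can locate module_name, without importing it."""
    return importlib.util.find_spec(module_name) is not None


# Expected to be True once the package is installed and on sys.path
print(is_importable("deprecation"))
print(is_importable("json"))
```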

In [None]:
# Can install Ubuntu packages to augment the default Python
!apt install -y python3.10-venv
# Now we can create virtual environments
!python3 -m venv my-venv
# Activation only lasts for one line because each `!` line runs in its own shell,
# but several commands can share a line when separated by `;`
!source my-venv/bin/activate;pip list
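
A virtual environment's interpreter can also be called by its path, which avoids the need to activate it. The sketch below uses the standard library `venv` module to create a throwaway environment in a temporary folder and then asks its interpreter for `sys.prefix`. The folder name `demo-venv` is hypothetical, and `with_pip=False` is chosen only to keep the example quick.

```python
import os
import subprocess
import tempfile
import venv
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    env_dir = Path(tmp) / "demo-venv"      # hypothetical name, in a temp folder
    venv.create(env_dir, with_pip=False)   # with_pip=False keeps the sketch fast

    # The interpreter lives in bin/ on Linux and macOS, Scripts/ on Windows
    bin_dir = "Scripts" if os.name == "nt" else "bin"
    exe = "python.exe" if os.name == "nt" else "python"
    env_python = env_dir / bin_dir / exe

    # Run the environment's own interpreter directly, no activation needed
    result = subprocess.run(
        [str(env_python), "-c", "import sys; print(sys.prefix)"],
        capture_output=True, text=True, check=True,
    )
    env_prefix = result.stdout.strip()
    print(env_prefix)
```

In a notebook cell the same idea would be `!my-venv/bin/python -m pip list`.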

In [None]:
# Create a folder in Google drive called 'kaggle'
# Download kaggle.json from https://www.kaggle.com/settings and save in this folder
# Only have to do this once

In [None]:
# Connect Google Drive to this Colab runtime
from google.colab import drive
drive.mount("/content/gdrive")

In [None]:
# Set KAGGLE_CONFIG_DIR before importing kaggle so that the import can find your API token
import os
os.environ['KAGGLE_CONFIG_DIR'] = "/content/gdrive/MyDrive/kaggle/"
import kaggle
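
If the import fails, a likely cause is that `kaggle.json` is not where `KAGGLE_CONFIG_DIR` points. The following sketch checks for the file first; `has_kaggle_token` is a hypothetical helper, and the fallback path assumes the Drive folder described earlier.

```python
import os
from pathlib import Path


def has_kaggle_token(config_dir):
    """Return True if config_dir contains a kaggle.json file (hypothetical helper)."""
    return (Path(config_dir) / "kaggle.json").is_file()


# Fall back to the Drive folder used in this session if the variable is unset
config_dir = os.environ.get("KAGGLE_CONFIG_DIR", "/content/gdrive/MyDrive/kaggle/")
if has_kaggle_token(config_dir):
    print("Token found, `import kaggle` should be able to authenticate")
else:
    print(f"No kaggle.json in {config_dir}; check the Drive mount and folder name")
```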

In [None]:
# For example to download from competition bluebook-for-bulldozers
# https://www.kaggle.com/c/bluebook-for-bulldozers
!kaggle competitions download -c bluebook-for-bulldozers

In [None]:
!unzip bluebook-for-bulldozers.zip -d bluebook-for-bulldozers
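
Archives can also be extracted from Python with the standard library `zipfile` module. The sketch below is self-contained: it builds a tiny stand-in archive with made-up contents in a temporary folder and then extracts it, which mirrors `unzip archive.zip -d folder`.

```python
import tempfile
import zipfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    archive = Path(tmp) / "sample.zip"   # stand-in for the downloaded archive
    target = Path(tmp) / "sample"        # folder to extract into, like `-d`

    # Build a small archive so the example runs anywhere (made-up contents)
    with zipfile.ZipFile(archive, "w") as zf:
        zf.writestr("data.csv", "id,price\n1,100\n")

    # Extract every member into the target folder
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(target)

    extracted = sorted(p.name for p in target.iterdir())
    print(extracted)
```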

In [None]:
!ls -al bluebook-for-bulldozers

In [None]:
# Check the data in the csv file
!head bluebook-for-bulldozers/TrainAndValid.csv
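
The first few lines of a file can be read in Python without loading the whole file. A minimal sketch, assuming a hypothetical helper named `head` and a small made-up CSV file so that the example runs on its own:

```python
import itertools
import tempfile
from pathlib import Path


def head(path, count=10):
    """Return the first `count` lines of a text file, like the shell's `head`."""
    with open(path, encoding="utf-8") as handle:
        # islice stops reading after `count` lines
        return [line.rstrip("\n") for line in itertools.islice(handle, count)]


with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.csv"
    sample.write_text("id,price\n1,100\n2,250\n3,75\n", encoding="utf-8")
    first_lines = head(sample, count=2)
    print(first_lines)
```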

In [None]:
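# Note that changing directory in a `!` shell command doesn't change Python's working directory,
# so the path below is relative to the folder the notebook started in
import pandas as pd
df = pd.read_csv('bluebook-for-bulldozers/TrainAndValid.csv')
# head is a method, so it needs parentheses to return the first five rows
df.head()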
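
The note about changing directory can be checked with a short experiment. In this sketch `subprocess.run` stands in for a `!` shell line: the `cd` happens in a child process, so Python's own working directory is unaffected.

```python
import os
import subprocess

before = os.getcwd()
# The `cd` runs inside a child shell process, which exits when the command finishes
subprocess.run("cd ..", shell=True, check=True)
after = os.getcwd()
print(before == after)
```

To change the directory that Python itself uses, `os.chdir` or the IPython `%cd` magic can be used instead.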