<a href="https://colab.research.google.com/github/kalz2q/mycolabnotebooks/blob/master/pragailab01.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Notes

These are my notes from reading a notebook by Pragmatic AI Labs in Colab.

I find Colab an excellent tool for productive learning and have built various things with it, so I was curious how other people write introductory material in Colab. That is how I found this notebook.

The original notebook is titled
 **Essential Machine Learning and Exploratory Data Analysis with Python and Jupyter Notebook**,
its file name is `Public-Master-SafariOnline-Day1-Part1.ipynb`,
and its URL is
https://github.com/noahgift/functional_intro_to_python#safari-online-training--essential-machine-learning-and-exploratory-data-analysis-with-python-and-jupyter-notebook

Because I modify it as I read, I keep a copy at
`https://github.com/kalz2q/mycolabnotebooks/`.



## Part 1.1: Introduction to Python, Jupyter, and Colab


### Shell commands

The notebook opens with the shell command `!ls -l`.


In [None]:
!ls -l

total 4
drwxr-xr-x 1 root root 4096 Jul 30 16:30 sample_data


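Run in Colab, this naturally lists the environment that Colab provides.

Shell commands work the same way in IPython, Jupyter, and Colab, but seeing Colab's own environment makes this an engaging introduction.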

In [None]:
var = !ls -l
type(var)

IPython.utils.text.SList

Checking the type with `type()` returns `IPython.utils.text.SList`, which shows that Jupyter is built on top of IPython. I find this an engaging introduction as well.

In [None]:
# var.fields?

In [None]:
var.fields()

[['total', '4'],
 ['drwxr-xr-x',
  '1',
  'root',
  'root',
  '4096',
  'Jul',
  '30',
  '16:30',
  'sample_data']]

In [None]:
var.grep("data")

['drwxr-xr-x 1 root root 4096 Jul 30 16:30 sample_data']

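I am not sure what this part is meant to show.

Perhaps it demonstrates how to find `data` in case the environment changes.

A shell pipeline like the following would do the same job.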

In [None]:
!ls -l | grep 'data'

drwxr-xr-x 1 root root 4096 Jul 30 16:30 sample_data
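
For readers unsure what `fields()` and `grep()` do, here is a rough plain-Python sketch of the same idea. It is not IPython's actual implementation: the helper names `split_fields` and `grep_lines` and the hard-coded `lines` list are made up for illustration.

```python
import re

# Sample captured output, copied from the `!ls -l` cell above.
lines = [
    "total 4",
    "drwxr-xr-x 1 root root 4096 Jul 30 16:30 sample_data",
]

def split_fields(lines):
    """Split every line on whitespace (roughly what SList.fields() returns)."""
    return [line.split() for line in lines]

def grep_lines(lines, pattern):
    """Keep the lines that match a regular expression (roughly what SList.grep() does)."""
    return [line for line in lines if re.search(pattern, line)]

print(split_fields(lines))
print(grep_lines(lines, "data"))
```

Compared with a shell pipeline, the result stays in a Python variable and can be processed further with ordinary list operations.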


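### Features of Colab


*  TPUs and GPUs are available for execution.
*  Jupyter notebooks can be loaded.
*  Notebooks can be saved up to the Google Drive storage limit.
*  It integrates with GitHub.
*  It can connect to a local runtime.
*  Colab forms can be created: [forms in Colab](https://colab.research.google.com/notebooks/forms.ipynb).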





### Mounting Google Drive

This looks a little risky, so I will skip it.


In [None]:
# from google.colab import drive
# drive.mount('/content/gdrive', force_remount=True)

In [None]:
import os;os.listdir("/content/gdrive/My Drive/awsml")

['kaggle.json', 'credentials', 'config', 'cloudai-7ab42a4d8a43.json']

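### Magic commands

Magic commands are invoked with a leading `%`. Use `%%` to apply a magic to the whole cell (a cell magic) and a single `%` for one line (a line magic). For line magics the `%` can also be omitted, as long as no variable with the same name is defined in the namespace.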


### %timeit

In [18]:
too_many_decimals = 1.912345897

print("built in Python Round")
%timeit round(too_many_decimals, 2)


built in Python Round
The slowest run took 95.04 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 3: 461 ns per loop
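
Outside IPython, the standard-library `timeit` module gives a comparable measurement. This is a minimal sketch: the loop count of 100,000 is an arbitrary choice, and the timings will differ from machine to machine.

```python
import timeit

too_many_decimals = 1.912345897

# Time 100,000 calls of round() and report the average cost per call.
loops = 100_000
total_seconds = timeit.timeit(lambda: round(too_many_decimals, 2), number=loops)
per_call_ns = total_seconds / loops * 1e9
print(f"{per_call_ns:.0f} ns per call (includes the lambda call overhead)")
```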


### %alias

The syntax differs slightly from `bash`.

In [13]:
alias lscsv ls -l sample_data/*.csv 

In [14]:
lscsv

-rw-r--r-- 1 root root   301141 Jul 30 16:30 sample_data/california_housing_test.csv
-rw-r--r-- 1 root root  1706430 Jul 30 16:30 sample_data/california_housing_train.csv
-rw-r--r-- 1 root root 18289443 Jul 30 16:30 sample_data/mnist_test.csv
-rw-r--r-- 1 root root 36523880 Jul 30 16:30 sample_data/mnist_train_small.csv


### %who

On Unix, `who` lists the logged-in users, but the magic command lists the variables that are currently defined.

In [15]:
var1=1

In [45]:
who

GoogleCredentials	 Use_Python	 alt	 auth	 base	 cars	 data	 df	 drive	 
files	 gc	 gspread	 interval	 pd	 sys	 too_many_decimals	 uploaded	 var	 
var1	 


In [46]:
too_many_decimals

1.912345897

### %writefile

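When this magic is on the first line of a cell, every following line of the cell is saved to a file, which is handy.

The file probably disappears when the session ends, but because the cell stays in the notebook, running it again recreates the file.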

In [23]:
%%writefile magic_stuff.py
import pandas as pd
df = pd.read_csv(
    "https://raw.githubusercontent.com/noahgift/food/master/data/features.en.openfoodfacts.org.products.csv")
df.drop(["Unnamed: 0", "exceeded", "g_sum", "energy_100g"], axis=1, inplace=True) #drop two rows we don't need
df = df.drop(df.index[[1,11877]]) #drop outlier
df.rename(index=str, columns={"reconstructed_energy": "energy_100g"}, inplace=True)
print(df.head())

Writing magic_stuff.py


In [None]:
cat magic_stuff.py

import pandas as pd
df = pd.read_csv(
    "https://raw.githubusercontent.com/noahgift/food/master/data/features.en.openfoodfacts.org.products.csv")
df.drop(["Unnamed: 0", "exceeded", "g_sum", "energy_100g"], axis=1, inplace=True) #drop two rows we don't need
df = df.drop(df.index[[1,11877]]) #drop outlier
df.rename(index=str, columns={"reconstructed_energy": "energy_100g"}, inplace=True)
print(df.head())

In [26]:
!python magic_stuff.py

   fat_100g  ...                         product
0     28.57  ...  Banana Chips Sweetened (Whole)
2     57.14  ...          Organic Salted Nut Mix
3     18.75  ...                  Organic Muesli
4     36.67  ...                   Zen Party Mix
5     18.18  ...            Cinnamon Nut Granola

[5 rows x 7 columns]


In [27]:
!pip install -q pylint


In [28]:
!pylint magic_stuff.py

************* Module magic_stuff
magic_stuff.py:3:0: C0301: Line too long (109/100) (line-too-long)
magic_stuff.py:4:0: C0301: Line too long (110/100) (line-too-long)
magic_stuff.py:5:24: C0326: Exactly one space required after comma
df = df.drop(df.index[[1,11877]]) #drop outlier
                        ^ (bad-whitespace)
magic_stuff.py:7:0: C0304: Final newline missing (missing-final-newline)
magic_stuff.py:1:0: C0114: Missing module docstring (missing-module-docstring)

-----------------------------------
Your code has been rated at 1.67/10



### %%bash

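Every line after this cell magic is executed as a `bash` command.

In the next example, `uname -a` prints system information.

`ls` lists files.

`ps` lists processes.

This runs in a single cell what would otherwise need a separate `!` shell command for each line.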

In [None]:
%%bash
uname -a
ls
ps

In [29]:
!uname -a

Linux ee0f14db3d7c 4.19.104+ #1 SMP Wed Feb 19 05:26:34 PST 2020 x86_64 x86_64 x86_64 GNU/Linux


### %%html

Can cell magics be written in either upper or lower case?

Listing them with `%lsmagic` shows that only `%%HTML` has an uppercase variant; all the others appear to be lowercase only.

In [48]:
%%HTML
<h1>Only The Best Tags and People</h1>

### Upload to Colab

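The `gspread` package used next works with Google Sheets. The idea seems to be that Excel is no longer needed, but I do not fully understand it yet.

`auth` also looks a little risky, so I will skip it for now.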

In [None]:
!pip install --upgrade  gspread

In [None]:
# from google.colab import auth
# auth.authenticate_user()
# 
# import gspread
# from oauth2client.client import GoogleCredentials
# 
# gc = gspread.authorize(GoogleCredentials.get_application_default())
# 
# worksheet = gc.open('Your spreadsheet name').sheet1
# 
# # get_all_values gives a list of rows.
# rows = worksheet.get_all_values()
# print(rows)
# 
# # Convert to a DataFrame and render.
# import pandas as pd
# pd.DataFrame.from_records(rows)

In [33]:
# load an example dataset
from vega_datasets import data
cars = data.cars()

# plot the dataset, referencing dataframe column names
import altair as alt
alt.Chart(cars).mark_bar().encode(
  x='mean(Miles_per_Gallon)',
  y='Origin',
  color='Origin'
)

Upload your file here (request.csv)

Nice: files can be uploaded from my own machine.

# Current reading position

In [52]:
from google.colab import files
uploaded = files.upload()

Saving takotako.ly to takotako.ly


In [53]:
!ls -l

total 16
-rw-r--r-- 1 root root 2657 Aug  1 04:40 adc.json
-rw-r--r-- 1 root root  407 Aug  1 04:13 magic_stuff.py
drwxr-xr-x 1 root root 4096 Jul 30 16:30 sample_data
-rw-r--r-- 1 root root  430 Aug  1 06:36 takotako.ly


In [35]:
# load an example dataset
from vega_datasets import data
cars = data.cars()

import altair as alt

interval = alt.selection_interval()

base = alt.Chart(cars).mark_point().encode(
  y='Miles_per_Gallon',
  color=alt.condition(interval, 'Origin', alt.value('lightgray'))
).properties(
  selection=interval
)

base.encode(x='Acceleration') | base.encode(x='Horsepower')

In [36]:
!ls -l

total 12
-rw-r--r-- 1 root root 2657 Aug  1 04:40 adc.json
-rw-r--r-- 1 root root  407 Aug  1 04:13 magic_stuff.py
drwxr-xr-x 1 root root 4096 Jul 30 16:30 sample_data


In [None]:
import pandas as pd
df = pd.read_csv("automl-tables_notebooks_census_income.csv")
df.head(2)

In [38]:
# load an example dataset
from vega_datasets import data
cars = data.cars()

# plot the dataset, referencing dataframe column names
import altair as alt
alt.Chart(cars).mark_bar().encode(
  x=alt.X('Miles_per_Gallon', bin=True),
  y='count()',
)

In [39]:
import pandas as pd
df = pd.read_csv("https://raw.githubusercontent.com/noahgift/sugar/master/data/education_sugar_cdc_2003.csv")
df.head()

Unnamed: 0,State,Employed,Not employed,Retired,<High school,High school,Some college,College graduate
0,Alaska,26.2 (23.6–28.9),32.1 (27.8–36.8),16.0 (12.6–20.2),47.1 (37.8–56.5),34.9 (31.1–38.9),24.2 (21.0–27.8),12.9 (10.5–15.7)
1,Arizona,33.0 (28.5–37.8),28.7 (23.5–34.5),13.8 (10.8–17.5),40.4 (30.9–50.7),36.5 (30.7–42.7),24.4 (19.9–29.4),14.6 (11.6–18.3)
2,California,22.9 (20.9–25.1),30.2 (27.1–33.4),15.0 (12.2–18.2),38.5 (34.2–43.0),29.9 (26.5–33.7),21.4 (18.8–24.2),11.5 (9.8–13.5)
3,Connecticut,18.9 (17.1–20.9),24.3 (20.8–28.2),15.0 (12.7–17.7),27.8 (22.4–33.9),26.9 (23.7–30.3),19.9 (17.2–23.0),10.2 (8.7–12.0)
4,District of Columbia,18.5 (15.7–21.7),34.6 (29.5–40.1),18.5 (15.3–22.1),45.6 (36.4–55.2),39.0 (33.1–45.2),28.9 (23.4–35.0),8.4 (7.0–10.1)


In [40]:
df.describe()

Unnamed: 0,State,Employed,Not employed,Retired,<High school,High school,Some college,College graduate
count,24,24,24,24,24,24,24,24
unique,24,24,24,24,24,24,23,24
top,Maryland,46.5 (43.5–49.6),35.3 (32.6–38.0),15.4 (14.0–17.0),40.0 (32.2–48.3),48.6 (44.8–52.5),24.2 (21.0–27.8),24.0 (21.9–26.2)
freq,1,1,1,1,1,1,2,1


### Forms in Colab

In [41]:
Use_Python = False #@param ["False", "True"] {type:"raw"}

In [42]:
print(f"You select it is {Use_Python} you use Python")

You select it is False you use Python


### Python executable

The `python` executable can run scripts, start a REPL, and even run Python statements with the `-c` flag, using semicolons to string multiple statements together

In [None]:
!python -c "import os;print(os.listdir())"

['.config', 'magic_stuff.py', 'gdrive', 'sample_data']
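
The same `-c` invocation can be launched from Python itself with `subprocess`. A minimal sketch, assuming Python 3.7 or later for `capture_output`; the one-line program passed to `-c` is just an example.

```python
import subprocess
import sys

# sys.executable is the interpreter running this code; -c takes a program as a string.
result = subprocess.run(
    [sys.executable, "-c", "import os; print(type(os.listdir()).__name__)"],
    capture_output=True,
    text=True,
    check=True,
)
print(result.stdout.strip())  # list
```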


In [None]:
!ls -l

total 712
drwx------ 4 root root   4096 Sep  9 16:16 gdrive
-rw-r--r-- 1 root root    407 Sep  9 16:20 magic_stuff.py
-rw-r--r-- 1 root root 712814 Sep  9 16:25 pytorch.pptx
drwxr-xr-x 1 root root   4096 Aug 27 16:17 sample_data


In [43]:
!pip install -q yellowbrick

In [44]:
#this is how you capture input to a program
import sys;sys.argv

['/usr/local/lib/python3.6/dist-packages/ipykernel_launcher.py',
 '-f',
 '/root/.local/share/jupyter/runtime/kernel-72661fff-1d76-421c-8c1b-489a25bd7fa9.json']
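
Inside a notebook, `sys.argv` holds the kernel's launch arguments, as shown above. In a script it holds the command-line arguments, which the standard `argparse` module can parse. The sketch below is illustrative only: the `--technique` and `--times` options are hypothetical, and an explicit list is passed to `parse_args` so that it also runs in a notebook.

```python
import argparse

parser = argparse.ArgumentParser(description="Hypothetical practice logger")
parser.add_argument("--technique", default="armbar")
parser.add_argument("--times", type=int, default=1)

# In a script, parse_args() with no arguments reads sys.argv[1:] instead.
args = parser.parse_args(["--technique", "kimura", "--times", "3"])
print(args.technique, args.times)  # kimura 3
```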

### Introductory Concepts
*  **Procedural Statements**
*  Strings and String Formatting
*  Numbers and Arithmetic Operations
*  Data Structures



 ### Procedural Statements
 Procedural statements are literally statements that  can be issued one line at a time.  Below are types of procedural statements.  These statements can be run in:
 * Jupyter Notebook
 * IPython shell
 * Python interpreter
 * Python scripts

**Printing**

In [None]:
print("Hello world")

Hello world


**Create Variable and Use Variable**

In [None]:
variable = "armbar"
variable

'armbar'

**Multiple procedural statements**

In [None]:
attack_one = "kimura"
attack_two = "arm triangle"
print("In Brazilian Jiu Jitsu a common attack is a:", attack_one)
print("Another common attack is a:", attack_two)

In Brazilian Jiu Jitsu a common attack is a: kimura
Another common attack is a: arm triangle


**Adding Numbers**

In [None]:
1+1

2

**Adding Phrases**

In [None]:
"arm" + " bar"+" 4"+" morestuff " + "lemon"

'arm bar 4 morestuff lemon'

**Complex statements**

More complex statements can be created that use data structures like the belts variable, which is a list.

In [None]:
belts = ["white", "blue", "purple", "brown", "black"]
for belt in belts:
    if "black" in belt:
        print("The belt I want to be is:", belt)
    else:
        print("This is not the belt I want to end up at:", belt)

This is not the belt I want to end up at: white
This is not the belt I want to end up at: blue
This is not the belt I want to end up at: purple
This is not the belt I want to end up at: brown
The belt I want to be is: black


### Strings and String Formatting

Strings are sequences of characters, and they are often formatted programmatically.  Almost all Python programs use strings, because they carry the messages shown to the program's users.  When creating strings there are a few core concepts to understand:

* Strings can be created with single, double, or triple quotes
* Strings can be formatted
* One complication of strings is that they can be encoded in several formats, including Unicode
* Many methods are available to operate on strings.  In an editor or IPython shell you can see these methods by tab completion: 
```
basic_string.
            capitalize()   format()       islower()      lower()        rpartition()   title()         
            casefold()     format_map()   isnumeric()    lstrip()       rsplit()       translate()     
            center()       index()        isprintable()  maketrans()    rstrip()       upper()         
            count()        isalnum()      isspace()      partition()    split()        zfill()         
            encode()       isalpha()      istitle()      replace()      splitlines()                  
            endswith()     isdecimal()    isupper()      rfind()        startswith()                  
            expandtabs()   isdigit()      join()         rindex()       strip()                       
            find()         isidentifier() ljust()        rjust()        swapcase()        
```

In [None]:
my_string = "this is a string I am using this time"
my_string.split()
#my_string.upper()
#my_string.title()
#my_string.count("this")

['this', 'is', 'a', 'string', 'I', 'am', 'using', 'this', 'time']

In [None]:
my_string.capitalize()

'This is a string i am using this time'

In [None]:
my_string.isnumeric()



False

In [None]:
print(my_string)
var2 = my_string.swapcase()
print(var2)
print(var2.swapcase())

this is a string I am using this time
THIS IS A STRING i AM USING THIS TIME
this is a string I am using this time


**Basic String**

In [None]:
basic_string = "Brazilian Jiu Jitsu"

**Splitting String**

Turn a string into a list by splitting on spaces or on another separator

In [None]:
#split on spaces (default)
basic_string.split()

['Brazilian', 'Jiu', 'Jitsu']

In [None]:
result = basic_string.split()
len(result)

3

In [None]:
#split on hyphen
string_with_hyphen = "Brazilian-Jiu-Jitsu"
string_with_hyphen.split("-")

['Brazilian', 'Jiu', 'Jitsu']

In [None]:
#split on comma
string_with_comma = "Brazilian,Jiu,Jitsu"
string_with_comma.split(",")

['Brazilian', 'Jiu', 'Jitsu']
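
`str.join` is the inverse of `split`: it glues a list of strings back together with a separator. A short sketch reusing the phrase from the cells above (the variable names are made up for this example):

```python
words = "Brazilian-Jiu-Jitsu".split("-")

# Join the pieces back together with a different separator.
rejoined = " ".join(words)
print(rejoined)  # Brazilian Jiu Jitsu
```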

**Capitalize**

Capitalize the first character of a string and lowercase the rest (use `upper()` to get all capital letters)

In [None]:
basic_string.capitalize()

'Brazilian jiu jitsu'

**Slicing Strings**

Strings can be indexed and sliced, and `len()` gives their length

In [None]:
#Get the last character
basic_string[-1:]

'u'

In [None]:
len(basic_string[2:])

17

In [None]:
#Get length of string
len(basic_string)

19

In [None]:
basic_string[-18:]

'razilian Jiu Jitsu'
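
A slice has the general form `[start:stop:step]`, where each part is optional and negative numbers count from the end. The sketch below uses a separate variable, `phrase`, so that it does not depend on the state of `basic_string`.

```python
phrase = "Brazilian Jiu Jitsu"

first_word = phrase[:9]         # characters 0 through 8
last_word = phrase[-5:]         # the final five characters
reversed_phrase = phrase[::-1]  # a step of -1 walks the string backwards

print(first_word)       # Brazilian
print(last_word)        # Jitsu
print(reversed_phrase)  # ustiJ uiJ nailizarB
```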

**Strings Can Be Added Together**

In [None]:
basic_string + " is my favorite Martial Art"

'Brazilian Jiu Jitsu is my favorite Martial Art'

In [None]:
items = ["-",1,2,3]
for item in items:
  basic_string += str(item)
basic_string

'Brazilian Jiu Jitsu-123'

In [None]:
"this is a string format: %s" % "wow"

'this is a string format: wow'

**F-Strings Can Be Formatted in More Complex Ways**

One of the best ways to format a string in modern Python 3 is to use f-strings

In [None]:
f'I love practicing my favorite Martial Art, {basic_string}'

'I love practicing my favorite Martial Art, Brazilian Jiu Jitsu-123'
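
F-strings also accept a format specification after a colon inside the braces. A small sketch with made-up values:

```python
weight = 200
body_fat = 0.10345

# :.1% formats a fraction as a percentage with one decimal place.
line_one = f"Body fat: {body_fat:.1%}"
# :>8 right-aligns the value in a field eight characters wide.
line_two = f"Weight: {weight:>8}lbs"
# :, inserts thousands separators.
line_three = f"Steps: {1234567:,}"

print(line_one)    # Body fat: 10.3%
print(line_two)
print(line_three)  # Steps: 1,234,567
```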

**Strings Can Use Triple Quotes to Wrap**

In [None]:
f"""
This phrase is multiple sentences long.
This phrase can be formatted like simpler sentences,
for example, I can still talk about my favorite Martial Art {basic_string}
"""

'\nThis phrase is multiple sentences long.\nThis phrase can be formatted like simpler sentences,\nfor example, I can still talk about my favorite Martial Art Brazilian Jiu Jitsu-123\n'

**Line Breaks Can Be Removed with Replace**

The last long line contained line breaks, which are the **\n** character, and they can be removed by using the replace method

In [None]:
f"""
This phrase is multiple sentences long.
This phrase can be formatted like simpler sentences,
for example, I can still talk about my favorite Martial Art {basic_string}
""".replace("\n", "|")

'|This phrase is multiple sentences long.|This phrase can be formatted like simpler sentences,|for example, I can still talk about my favorite Martial Art Brazilian Jiu Jitsu-123|'

### Numbers and Arithmetic Operations

Python also works as a calculator out of the box. Without installing any additional libraries it can do many simple and complex arithmetic operations.

**Adding and Subtracting Numbers**

In [None]:
steps = (1+1)-1
print(f"Two Steps Forward:  One Step Back = {steps}")

Two Steps Forward:  One Step Back = 1


**Multiplication with Decimals**

Can use float type to solve decimal problems

In [None]:
body_fat_percentage = 0.10
weight = 200
fat_total = body_fat_percentage * weight
print(f"I weigh 200lbs, and {fat_total}lbs of that is fat")

I weigh 200lbs, and 20.0lbs of that is fat


The `decimal` library can also be used to set the precision and to deal with repeating decimals


In [None]:
from decimal import (Decimal, getcontext)

getcontext().prec = 6
Decimal(1)/Decimal(3)



Decimal('0.333333')
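
One reason to reach for `Decimal` is that binary floats cannot represent most decimal fractions exactly. A minimal sketch of the difference:

```python
from decimal import Decimal

# Binary floats cannot represent 0.1 exactly, so small errors appear.
float_sum = 0.1 + 0.2
print(float_sum == 0.3)  # False

# Decimals built from strings keep the exact decimal value.
decimal_sum = Decimal("0.1") + Decimal("0.2")
print(decimal_sum == Decimal("0.3"))  # True
```

Note that the `Decimal` values are built from strings: `Decimal(0.1)` would inherit the float's rounding error.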

**Using Exponents**

Using the Python math library it is straightforward to raise 2 to the 4th power

In [None]:
import math
math.pow(2,4)

16.0

Can also use built in exponent operator to accomplish same thing

In [None]:
2**3

8

In [None]:
2**4

16

For comparison, this is regular multiplication

In [None]:
2*3

6

**Converting Between different numerical types**

There are many numerical forms to be aware of in Python.
A couple of the most common are:

* Integers
* Floats

In [None]:
number = 100
num_type = type(number).__name__
print(f"{number} is type [{num_type}]")

100 is type [int]


In [None]:
number = float(100)
num_type = type(number).__name__
print(f"{number} is type [{num_type}]")

100.0 is type [float]


In [None]:
num2 = 100.20
type(num2)

float
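
Conversions between numeric types, and from strings, are explicit. A short sketch of the behaviors that most often surprise:

```python
# int() truncates toward zero; round() rounds to the nearest integer.
truncated = int(3.9)
truncated_negative = int(-3.9)
rounded = round(3.9)
print(truncated, truncated_negative, rounded)  # 3 -3 4

# Strings must be converted explicitly before doing arithmetic.
from_int_string = int("100") + 1
from_float_string = float("100.20") * 2
print(from_int_string, from_float_string)  # 101 200.4
```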

In [None]:
class Foo:pass
f = Foo()

In [None]:
type(f)

__main__.Foo

**Numbers can also be rounded**

Python Built in round 

In [None]:
too_many_decimals = 1.912345897
round(too_many_decimals, 6)
#get more info
#round?

1.912346

Numpy round

In [None]:
import numpy as np
np.round(too_many_decimals, 6)

1.912346

Pandas round

In [None]:
import pandas as pd
df = pd.DataFrame([too_many_decimals], columns=["A"], index=["first"])
df.round(2)


Unnamed: 0,A
first,1.91


Simple benchmark of all three (**Python**, **numpy** and **Pandas** round):   using **%timeit**

*Depending on what is being rounded (e.g. a very large DataFrame), performance may vary, so knowing how to benchmark `round` is important.*


In [None]:
print("built in Python Round")
%timeit round(too_many_decimals, 2)

print("numpy round")
%timeit np.round(too_many_decimals, 2)

print("Pandas DataFrame round")
%timeit df.round(2)

built in Python Round
The slowest run took 19.75 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 3: 474 ns per loop
numpy round
The slowest run took 8.19 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 10.9 µs per loop
Pandas DataFrame round
1000 loops, best of 3: 1.03 ms per loop


### Data Structures
Python has a couple of core Data Structures that are used very frequently

* Lists
* Dictionaries

Dictionaries and lists are the real workhorses of Python, but there are also other data structures like tuples, sets, Counters, etc., that are worth exploring too.

### Python Dictionaries

The workhorse of Python data structures

### Creating Python Dictionaries

Python dictionaries can be created with *curly braces `{}`*

In [None]:
#bad_dictionary = {[2]:"one"}  # would raise TypeError: a list is unhashable, so it cannot be a key

In [None]:
new_dictionary = {"one":1}

In [None]:
submissions = {"armbar": "upper_body", 
               "arm_triangle": "upper_body", 
               "heel_hook": "lower_body", 
               "knee_bar": "lower_body"}
#type(submissions)
#submissions.items?
submissions

{'arm_triangle': 'upper_body',
 'armbar': 'upper_body',
 'heel_hook': 'lower_body',
 'knee_bar': 'lower_body'}

In [None]:
new_dict =dict(upper_body="lower_body")
new_dict

{'upper_body': 'lower_body'}

### Using Python Dictionaries
A common dictionary usage pattern is to *iterate* on a dictionary by using the items method. In the example below the key and the value are printed:

In [None]:
#submissions.items?


In [None]:
for submission, body_part in submissions.items():
    print(f"The {submission} is an attack on the {body_part}")

The armbar is an attack on the upper_body
The arm_triangle is an attack on the upper_body
The heel_hook is an attack on the lower_body
The knee_bar is an attack on the lower_body


Dictionaries can also be used to *filter*.  In the example below, only the submission attacks on the lower body are displayed:

In [None]:
for _, body_parts in submissions.items():
  print(body_parts)

upper_body
upper_body
lower_body
lower_body


In [None]:
print(f"These are lower_body submission attacks in Brazilian Jiu Jitsu:")
for submission, body_part in submissions.items():
    if body_part == "lower_body":
        print(submission)

These are lower_body submission attacks in Brazilian Jiu Jitsu:
heel_hook
knee_bar
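
The same filter can be written as a dictionary comprehension, which builds a new dictionary instead of printing. A minimal sketch that repeats the `submissions` data so it runs on its own; `lower_body` is a name chosen for this example.

```python
submissions = {"armbar": "upper_body",
               "arm_triangle": "upper_body",
               "heel_hook": "lower_body",
               "knee_bar": "lower_body"}

# Keep only the entries whose value is "lower_body".
lower_body = {name: part for name, part in submissions.items() if part == "lower_body"}
print(lower_body)
```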


Dictionary keys and values can also be selected with the built-in *keys()* and *values()* methods

In [None]:
print(f"These are keys: {submissions.keys()}")
print(f"These are values: {submissions.values()}")

These are keys: dict_keys(['armbar', 'arm_triangle', 'heel_hook', 'knee_bar'])
These are values: dict_values(['upper_body', 'upper_body', 'lower_body', 'lower_body'])


Key lookup is very fast (constant time on average, because dictionaries are hash tables), and it is one of the most common ways to use a dictionary.

In [None]:
if "armbar" in submissions:
  print("found key")
  

found key


In [None]:
print("timing key membership")
%timeit if "armbar" in submissions: pass 

timing key membership
The slowest run took 49.76 times longer than the fastest. This could mean that an intermediate result is being cached.
10000000 loops, best of 3: 37.2 ns per loop
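
When a key may be missing, `dict.get` combines the membership test and the lookup in one call. A small sketch; `small_submissions` is a made-up dictionary for this example.

```python
small_submissions = {"armbar": "upper_body", "heel_hook": "lower_body"}

# Indexing a missing key raises KeyError; get() returns a default instead.
known = small_submissions.get("armbar", "unknown")
missing = small_submissions.get("bear_hug", "unknown")
print(known, missing)  # upper_body unknown
```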


### Python Lists

Lists are also very commonly used in Python. They allow for sequential collections. Lists can hold dictionaries, just as dictionaries can hold lists.

### Creating Lists

One way to create lists is with *[] syntax*

In [None]:
list_of_bjj_positions = ["mount", "full-guard", "half-guard", 
                         "turtle", "side-control", "rear-mount", 
                         "knee-on-belly", "north-south", "open-guard"]

Another way of creating lists is with the built-in *list()* constructor


In [None]:
bjj_dominant_positions = list()
bjj_dominant_positions.append("side-control")
bjj_dominant_positions.append("mount")
bjj_dominant_positions


['side-control', 'mount']

Yet another, very performant way to create lists is to use list comprehension syntax

In [None]:
guards = "full, half, open"
guard_list = [f"{guard}-guard" for guard in guards.split(", ")]
guard_list


['full-guard', 'half-guard', 'open-guard']
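
A comprehension can also filter and transform in one expression by adding an `if` clause. A minimal sketch with a shortened, made-up list of positions:

```python
positions = ["mount", "full-guard", "half-guard", "turtle", "open-guard"]

# Keep only the guard positions and upper-case them in a single pass.
guard_positions = [p.upper() for p in positions if p.endswith("-guard")]
print(guard_positions)  # ['FULL-GUARD', 'HALF-GUARD', 'OPEN-GUARD']
```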

### Using Lists

For loops are one of the simplest ways to use a list.

In [None]:
for position in list_of_bjj_positions:
    if "open" in position: #explore on your own "guard"
        print(position)

open-guard


Lists can also be used to select elements by slicing.

In [None]:
print(f'First position: {list_of_bjj_positions[:1]}')
print(f'Last position: {list_of_bjj_positions[-1:]}')
print(f'First three positions: {list_of_bjj_positions[0:3]}')

First position: ['mount']
Last position: ['open-guard']
First three positions: ['mount', 'full-guard', 'half-guard']


Lists can also be unpacked with `*` into built-in functions like `zip` to write powerful, succinct statements. Below, `zip(*bjj_position_matrix)` transposes the rows of the matrix into columns.


In [None]:
bjj_position_matrix = [
    ["dominant", "top-mount", "back-mount", "side-control"],
    ["neutral", "open-guard", "full-guard", "standing"],
    ["weak", "turtle", "bottom-back-mount", "bottom-mount"]
]
list(zip(*bjj_position_matrix))

[('dominant', 'neutral', 'weak'),
 ('top-mount', 'open-guard', 'turtle'),
 ('back-mount', 'full-guard', 'bottom-back-mount'),
 ('side-control', 'standing', 'bottom-mount')]
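
Two other common uses of `zip` are pairing two lists into a dictionary and undoing a transpose. A small sketch with made-up data:

```python
positions = ["mount", "turtle", "open-guard"]
ratings = ["dominant", "weak", "neutral"]

# zip pairs items by position; dict() turns the pairs into a lookup table.
rating_by_position = dict(zip(positions, ratings))
print(rating_by_position["turtle"])  # weak

# Applying zip(*...) twice restores the original rows (as tuples).
matrix = [["a", "b"], ["c", "d"]]
round_trip = list(zip(*zip(*matrix)))
print(round_trip)  # [('a', 'b'), ('c', 'd')]
```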

In [None]:
zip?

### Python Sets

Sets are unordered unique collections

### Creating Python Sets

Sets can be created by using the built-in *set()* function


In [None]:
unique_attacks = set(("armbar","armbar", "armbar", "kimura", "kimura"))
print(type(unique_attacks))
unique_attacks

<class 'set'>


{'armbar', 'kimura'}

### Using Sets

One of the most powerful ways to use sets is to find the differences between two collections

In [None]:
attacks_set_one = set(("armbar", "kimura", "heel-hook"))
attacks_set_two = set(("toe-hold", "knee-bar", "heel-hook"))
unique_set_one_attacks = attacks_set_one - attacks_set_two
print(f"Unique Set One Attacks {unique_set_one_attacks}")


Unique Set One Attacks {'armbar', 'kimura'}
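
Besides difference (`-`), sets support intersection, union, and symmetric difference through operators. A minimal sketch using set literals; the variable names are chosen for this example.

```python
attacks_one = {"armbar", "kimura", "heel-hook"}
attacks_two = {"toe-hold", "knee-bar", "heel-hook"}

shared = attacks_one & attacks_two       # intersection: in both sets
combined = attacks_one | attacks_two     # union: in either set
only_in_one = attacks_one ^ attacks_two  # symmetric difference: in exactly one set

# Sets are unordered, so sort before printing for a stable display.
print(sorted(shared))
print(sorted(combined))
print(sorted(only_in_one))
```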


Question:  

Q: `set()` is used to select unique values. How does it perform on the large data sets used in deep learning? If `set()` is not performant enough on large data sets, what are the alternatives?

## Part 1.2 Functions 

* [Read related material covered in Chapter 1 (Functions Section) of Pragmatic AI](https://www.safaribooksonline.com/library/view/pragmatic-ai-an/9780134863924/ch01.xhtml#ch01lev1sub17)

* [Watch video section 2:  Writing and Applying Functions](https://www.safaribooksonline.com/videos/essential-machine-learning/9780135261118/9780135261118-EMLA_01_02_00) 

*  **Writing Functions**
*  Function arguments:  positional, keyword
*  Functional Currying:  Passing uncalled functions
*  Functions that Yield
*  Decorators:  Functions that wrap other functions
*  Making Classes Behave Like Functions
*  Applying a Function to a Pandas DataFrame
*  Writing Lambdas

### Writing Functions
Learning to write a function is the most fundamental skill to learn in Python.  With a basic mastery of functions, it is possible to have an almost full command of the language.

**Simple function**

The simplest functions just return a value.

In [None]:
def favorite_martial_art():
    return "bjj"

In [None]:
print(favorite_martial_art())
# This is the same output
my_variable = "bjj"
my_variable

bjj


'bjj'

In [None]:
def myfunc():pass

In [None]:
res = myfunc()
print(res)
#result = myfunc()
#print(result)

None


**Documenting Functions**

It is a very good idea to document functions.  
In Jupyter Notebook and IPython, docstrings can be viewed by appending a `?` to the function name, e.g.

```
In [2]: favorite_martial_art_with_docstring?
Signature: favorite_martial_art_with_docstring()
Docstring: This function returns the name of my favorite martial art
File:      ~/src/functional_intro_to_python/<ipython-input-1-bef983c31735>
Type:      function
```

In [None]:
def favorite_martial_art_with_docstring():
    """This function returns the name of my favorite martial art
    This is more
    This is even more
    return "string"
    """
    return "bjj"

**Docstrings of functions can be printed out by referring to `__doc__`**

In [None]:
#favorite_martial_art_with_docstring.__doc__
favorite_martial_art_with_docstring?
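
The `?` syntax only works in IPython and Jupyter. In plain Python the docstring is available through `__doc__`, and `inspect.getdoc` returns it with the indentation cleaned up. A minimal sketch; `favorite_sport` is a hypothetical function defined only for this example.

```python
import inspect

def favorite_sport():
    """Return the name of a favorite sport.

    Hypothetical function, defined only for this sketch.
    """
    return "bjj"

# __doc__ holds the raw docstring, including the indentation of later lines.
print(favorite_sport.__doc__)
# inspect.getdoc() strips that indentation and surrounding blank lines.
print(inspect.getdoc(favorite_sport))
```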


In [None]:
#favorite_martial_art_with_docstring?

### Function arguments: positional, keyword

A function is most useful when arguments are passed to it, so that new values of `times` are processed inside the function. Here `times` is a positional argument, as opposed to a keyword argument. Positional arguments are processed in the order in which they are defined.

In [None]:
def practice(times):
    print(f"I like to practice {times} times a day")

In [None]:
practice(2)

I like to practice 2 times a day


In [None]:
practice(3)

I like to practice 3 times a day


**Positional Arguments are processed in order**

Note, *position* is the key to pay attention to.



In [None]:
def practice(times, technique, duration):
    print(f"I like to practice {technique}, {times} times a day, for {duration} minutes")

In [None]:
practice(3, "piano", 45)

I like to practice piano, 3 times a day, for 45 minutes


In [None]:
#Order is important: with the arguments swapped, the sentence comes out as nonsense
practice("piano", 7,60)

I like to practice 7, piano times a day, for 60 minutes


**Keyword Arguments are processed by key, value and can have default values**

One handy feature of keyword arguments is that you can set defaults and only change the defaults you want to change.

In [None]:
def practice(times=2, technique="python", duration=60):
    print(f"I like to practice {technique}, {times} times a day, for {duration} minutes")

In [None]:
practice()

I like to practice python, 2 times a day, for 60 minutes


In [None]:
practice(duration=90, times=4)

I like to practice python, 4 times a day, for 90 minutes
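
Positional and keyword arguments can be mixed, and a bare `*` in the signature forces everything after it to be passed by name. The sketch below uses a hypothetical variant, `practice_strict`, that returns the sentence instead of printing it.

```python
def practice_strict(technique, *, times=2, duration=60):
    """technique is positional; times and duration must be passed by name."""
    return f"I like to practice {technique}, {times} times a day, for {duration} minutes"

print(practice_strict("piano", duration=45))

# Passing times positionally is rejected, which prevents argument mix-ups.
try:
    practice_strict("piano", 7, 60)
except TypeError as error:
    print("TypeError:", error)
```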


**\*args and \*\*kwargs**

`*args` and `**kwargs` allow dynamic argument passing to functions. They should be used with discretion because they can make code hard to understand.

In [None]:
def attack_techniques(**kwargs):
    """This accepts any number of keyword arguments"""
    
    for name, attack in kwargs.items():
        print(f"This is an attack I would like to practice: {attack}")

In [None]:
attack_techniques(arm_attack="kimura", 
                  leg_attack="straight_ankle_lock", 
                  neck_attack="arm_triangle",
                 body_attack="charge")

This is an attack I would like to practice: kimura
This is an attack I would like to practice: straight_ankle_lock
This is an attack I would like to practice: arm_triangle
This is an attack I would like to practice: charge


In [None]:
#I can also pass as many keyword arguments as I want
attack_techniques(arm_attack="kimura", 
                  leg_attack="straight_ankle_lock", 
                  neck_attack="arm_triangle",
                  attack4="rear naked choke", attack5="key lock")

This is an attack I would like to practice: kimura
This is an attack I would like to practice: straight_ankle_lock
This is an attack I would like to practice: arm_triangle
This is an attack I would like to practice: rear naked choke
This is an attack I would like to practice: key lock
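
The cells above only exercise `**kwargs`. For completeness, here is a sketch of `*args`, which collects extra positional arguments into a tuple. `attack_sequence` is a hypothetical function written for this example.

```python
def attack_sequence(*args, **kwargs):
    """Collect positional arguments in a tuple and keyword arguments in a dict."""
    return args, kwargs

positional, keyword = attack_sequence("kimura", "armbar", finish="rear_naked_choke")
print(positional)  # ('kimura', 'armbar')
print(keyword)     # {'finish': 'rear_naked_choke'}

# A list can be unpacked into positional arguments with a single star.
combo = ["jab", "cross"]
print(attack_sequence(*combo))  # (('jab', 'cross'), {})
```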


**passing dictionary of keywords to function**

The `**kwargs` syntax can also be used to pass in arguments all at once

In [None]:
attacks = {"arm_attack":"kimura", 
           "leg_attack":"straight_ankle_lock", 
           "neck_attack":"arm_triangle"}

In [None]:
attack_techniques(**attacks)

This is an attack I would like to practice: kimura
This is an attack I would like to practice: straight_ankle_lock
This is an attack I would like to practice: arm_triangle


**Passing Around Functions**

Object-Oriented programming is a very popular way to program, but it isn't the only style available in Python. For concurrency and for Data Science, functional programming fits as a complementary style.

In the example below, a function is used inside another function by being passed in as an argument.

In [None]:
def attack_location(technique):
    """Return the location of an attack"""
    
    attacks = {"kimura": "arm_attack",
               "straight_ankle_lock": "leg_attack",
               "arm_triangle": "neck_attack"}
    if technique in attacks:
        return attacks[technique]
    return "Unknown"

In [None]:
attack_location("kimura")

'arm_attack'

In [None]:
attack_location("bear hug")

'Unknown'

In [None]:
def multiple_attacks(attack_location_function):
    """Takes a function that categorizes attacks and returns location"""
    
    new_attacks_list = ["rear_naked_choke", "americana", "kimura"]
    for attack in new_attacks_list:
        attack_location = attack_location_function(attack)
        print(f"The location of attack {attack} is {attack_location}")

In [None]:
multiple_attacks(attack_location)

The location of attack rear_naked_choke is Unknown
The location of attack americana is Unknown
The location of attack kimura is arm_attack


### Closures and Functional Currying

A closure is a nested function that remembers variables from the enclosing function's scope, even after the outer function has returned.

In Python, a common way to use closures is to keep track of state. In the example below, the outer function, `attack_counter`, holds the counts of attacks. The inner function, `attack_filter`, uses the Python 3 `nonlocal` keyword to modify the variables in the outer function.

This approach is referred to here as "functional currying", although strictly speaking currying means turning a function of several arguments into a chain of one-argument functions. Either way, it allows a specialized function to be created from a general one. As shown below, this style of function could be the basis of a simple video game, or a tool for the statistics crew of an MMA match.

In [None]:
# nonlocal cannot rebind a module-level (global) variable like this one
#lower_body_counter=5
def attack_counter():
    """Counts number of attacks on part of body"""
    lower_body_counter = 0
    upper_body_counter = 0
    #print(lower_body_counter)
    def attack_filter(attack):
        nonlocal lower_body_counter
        nonlocal upper_body_counter
        attacks = {"kimura": "upper_body",
           "straight_ankle_lock":"lower_body", 
           "arm_triangle":"upper_body",
            "keylock": "upper_body",
            "knee_bar": "lower_body"}
        if attack in attacks:
            if attacks[attack] == "upper_body":
                upper_body_counter +=1
            if attacks[attack] == "lower_body":
                lower_body_counter +=1
        print(f"Upper Body Attacks {upper_body_counter}, Lower Body Attacks {lower_body_counter}")
    return attack_filter

In [None]:
fight = attack_counter()

In [None]:
fight("kimura")

Upper Body Attacks 1, Lower Body Attacks 0


In [None]:
fight("knee_bar")

Upper Body Attacks 1, Lower Body Attacks 1


In [None]:
fight("keylock")

Upper Body Attacks 2, Lower Body Attacks 1
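
For comparison with the counter above, here is a minimal sketch of currying in the strict sense: a two-argument operation split into a chain of one-argument functions. The names `curried_attack` and `with_kimura` are hypothetical and do not come from the original notebook.

```python
def curried_attack(attack_one):
    """Return a function that waits for the second attack."""
    def second(attack_two):
        # attack_one is remembered from the enclosing scope (a closure)
        return f"First Attack {attack_one}, Second Attack {attack_two}"
    return second

with_kimura = curried_attack("kimura")  # fix the first argument
print(with_kimura("knee_bar"))
print(curried_attack("keylock")("arm_triangle"))  # both steps in one expression
```

Each call supplies exactly one argument, and the closure carries the earlier argument forward.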


### Partial Functions

Partial functions are useful for pre-filling some of the arguments of a function.

In [None]:
from functools import partial

def multiple_attacks(attack_one, attack_two):
  """Performs two attacks"""
  
  print(f"First Attack {attack_one}")
  print(f"Second Attack {attack_two}")
  
attack_this = partial(multiple_attacks, "kimura")
type(attack_this)

functools.partial

By using this partial function, only one argument is needed.

In [None]:
attack_this("knee-bar")

First Attack kimura
Second Attack knee-bar


Alternatively, the original function can still be called with two different attacks.

In [None]:
multiple_attacks("Darce Choke", "Bicep Slicer")

First Attack Darce Choke
Second Attack Bicep Slicer
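
`partial` can also pre-fill keyword arguments, not only positional ones. The sketch below assumes a hypothetical helper, `describe_attack`, that is not part of the original notebook.

```python
from functools import partial

def describe_attack(attack, location="unknown"):
    """Describe an attack and the body part it targets (hypothetical helper)."""
    return f"{attack} targets the {location}"

# Pre-fill the keyword argument; the positional one is supplied later
lower_body_attack = partial(describe_attack, location="lower_body")

print(lower_body_attack("knee_bar"))
# A partial object keeps a reference to the original function and its preset keywords
print(lower_body_attack.func.__name__, lower_body_attack.keywords)
```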


### Lazy Evaluated Functions (Generators)

A very useful style of programming is "lazy evaluation", and a generator is an example of it. Generators yield one item at a time.

The example below returns an "infinite" random sequence of attacks. The lazy part is that, although the sequence is infinite, each value is only produced when `next()` is called on the generator.

In [None]:
def lazy_return_random_attacks():
    """Yield attacks each time"""
    import random
    attacks = {"kimura": "upper_body",
           "straight_ankle_lock":"lower_body", 
           "arm_triangle":"upper_body",
            "keylock": "upper_body",
            "knee_bar": "lower_body"}
    while True:
        random_attack = random.choices(list(attacks.keys()))
        yield random_attack

In [None]:
attack = lazy_return_random_attacks()

In [None]:
type(attack)

generator

In [None]:
for _ in range(6):
    print(next(attack))

['straight_ankle_lock']
['keylock']
['knee_bar']
['arm_triangle']
['arm_triangle']
['knee_bar']
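
Instead of calling `next()` in a loop, a fixed number of items can be taken from an infinite generator with `itertools.islice`. This is a sketch under some assumptions: `random_attacks` is a hypothetical variant of the generator above that accepts a seed and uses `random.choice`, which returns a single item rather than a one-element list.

```python
import random
from itertools import islice

def random_attacks(seed=None):
    """Yield random attack names forever (hypothetical variant of the generator above)."""
    rng = random.Random(seed)  # private generator so a seed gives repeatable output
    attacks = ["kimura", "straight_ankle_lock", "arm_triangle", "keylock", "knee_bar"]
    while True:
        yield rng.choice(attacks)  # choice returns one item, not a list

# islice stops after six items, so the infinite loop is never exhausted
first_six = list(islice(random_attacks(seed=42), 6))
print(first_six)
```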


### Decorators:   Functions that wrap other functions

### Randomized Sleep Decorator

Another useful technique in Python is to use the decorator syntax to wrap one function with another. In the example below, a decorator adds a random sleep to each call of the function it wraps. When combined with the previous "infinite" attack generator, the sleep happens each time the decorated function is called to create a generator, not on each `next()`; the loop below creates a new generator on every iteration, so every attack is preceded by a sleep.

In [None]:
def randomized_speed_attack_decorator(function):
    """Randomizes the speed of attacks"""
    
    import time
    import random
    
    def wrapper_func(*args, **kwargs):
        sleep_time = random.randint(0,3)
        print(f"Attacking after {sleep_time} seconds")
        time.sleep(sleep_time)
        return function(*args, **kwargs)
    return wrapper_func

In [None]:
@randomized_speed_attack_decorator
def lazy_return_random_attacks():
    """Yield attacks each time"""
    import random
    attacks = {"kimura": "upper_body",
           "straight_ankle_lock":"lower_body", 
           "arm_triangle":"upper_body",
            "keylock": "upper_body",
            "knee_bar": "lower_body"}
    while True:
        random_attack = random.choices(list(attacks.keys()))
        yield random_attack

In [None]:
for _ in range(5):
    print(next(lazy_return_random_attacks()))

Attacking after 1 seconds
['arm_triangle']
Attacking after 3 seconds
['arm_triangle']
Attacking after 2 seconds
['knee_bar']
Attacking after 2 seconds
['arm_triangle']
Attacking after 2 seconds
['kimura']
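
Because the decorator above sleeps only when the generator is created, a different wrapper is needed to pause before every item. The following is a minimal sketch, not part of the original notebook: `delay_between_items` and `three_attacks` are hypothetical names, and a short fixed delay replaces the random one so the example runs quickly.

```python
import time
from functools import wraps

def delay_between_items(seconds):
    """Decorator factory: pause before each item of a generator function."""
    def decorator(generator_function):
        @wraps(generator_function)
        def wrapper(*args, **kwargs):
            for item in generator_function(*args, **kwargs):
                time.sleep(seconds)  # runs on every next(), not only at creation
                yield item
        return wrapper
    return decorator

@delay_between_items(0.01)  # short fixed delay, chosen only for the demo
def three_attacks():
    yield from ["kimura", "keylock", "knee_bar"]

start = time.perf_counter()
collected = list(three_attacks())
elapsed = time.perf_counter() - start
print(collected, f"{elapsed:.3f} sec")
```

Here the wrapper is itself a generator, so the sleep is part of the iteration rather than part of the call.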


### Timing Decorator

Using a decorator to time code is very common.

In [None]:
from functools import wraps
from time import time

def timing(f):
    @wraps(f)
    def wrap(*args, **kw):
        ts = time()
        result = f(*args, **kw)
        te = time()
        print(f"fun: {f.__name__}, args: [{args}, {kw}] took: {te-ts} sec")
        return result
    return wrap

Using the decorator to time the execution of a function:

In [None]:
@timing
def some_attacks():
  attack = lazy_return_random_attacks()
  for _ in range(5):
    print(next(attack))
    
some_attacks()
  

Attacking after 3 seconds
['arm_triangle']
['keylock']
['straight_ankle_lock']
['arm_triangle']
['arm_triangle']
fun: some_attacks, args: [(), {}] took: 3.0046119689941406 sec
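
The timing decorator above applies `functools.wraps` to its inner function, which is why the report shows `some_attacks` rather than `wrap`. The sketch below compares a decorator with and without `wraps`; the names `plain_decorator`, `wrapped_decorator`, `attack_a`, and `attack_b` are hypothetical.

```python
from functools import wraps

def plain_decorator(f):
    def wrap(*args, **kw):
        return f(*args, **kw)
    return wrap

def wrapped_decorator(f):
    @wraps(f)  # copies __name__, __doc__ and other metadata from f
    def wrap(*args, **kw):
        return f(*args, **kw)
    return wrap

@plain_decorator
def attack_a():
    """Docstring of attack_a"""

@wrapped_decorator
def attack_b():
    """Docstring of attack_b"""

print(attack_a.__name__, attack_a.__doc__)  # metadata of the inner wrap function
print(attack_b.__name__, attack_b.__doc__)  # metadata of the decorated function
```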


### Making Classes Behave Like Functions

Creating callable objects: a class that defines `__call__` lets its instances be called like functions.

In [None]:
class AttackFinder:
  """Finds the attack location"""
  
  
  def __init__(self, attack):
    self.attack = attack
  
  def __call__(self):
    attacks = {"kimura": "upper_body",
           "straight_ankle_lock":"lower_body", 
           "arm_triangle":"upper_body",
            "keylock": "upper_body",
            "knee_bar": "lower_body"}
    if self.attack not in attacks:
      return "unknown location"
    return attacks[self.attack]
    

In [None]:
my_attack = AttackFinder("kimura")
my_attack()

'upper_body'

### Applying Functions to Pandas DataFrames

The final lesson on functions is to take this knowledge and use it on a DataFrame in Pandas. One of the more fundamental concepts in Pandas is to use `apply` on a column instead of iterating through all of the values. In the example below, all of the numbers are rounded to the nearest whole number.

In [None]:
import pandas as pd
iris = pd.read_csv('https://raw.githubusercontent.com/mwaskom/seaborn-data/master/iris.csv')
iris.head(3)

| | sepal_length | sepal_width | petal_length | petal_width | species |
|---|---|---|---|---|---|
| 0 | 5.1 | 3.5 | 1.4 | 0.2 | setosa |
| 1 | 4.9 | 3.0 | 1.4 | 0.2 | setosa |
| 2 | 4.7 | 3.2 | 1.3 | 0.2 | setosa |


In [None]:
iris.shape

(150, 5)

In [None]:
iris['rounded_sepal_length'] = iris[['sepal_length']].apply(pd.Series.round)
iris.head()

| | sepal_length | sepal_width | petal_length | petal_width | species | rounded_sepal_length |
|---|---|---|---|---|---|---|
| 0 | 5.1 | 3.5 | 1.4 | 0.2 | setosa | 5.0 |
| 1 | 4.9 | 3.0 | 1.4 | 0.2 | setosa | 5.0 |
| 2 | 4.7 | 3.2 | 1.3 | 0.2 | setosa | 5.0 |
| 3 | 4.6 | 3.1 | 1.5 | 0.2 | setosa | 5.0 |
| 4 | 5.0 | 3.6 | 1.4 | 0.2 | setosa | 5.0 |


In [None]:
iris.shape

(150, 6)

This was done with a built-in function, but a custom function can also be written and applied to a column. In the example below, the values are multiplied by 100. The alternative would be to write a loop, transform the data, and then write it back. In Pandas, it is simpler to apply a custom function instead.

In [None]:
def multiply_by_100(x):
    """Multiplies by 100"""
    
    res = x * 100
    #print(f"This was passed in {x}, and this result was generated {res}")
    return res
  
  
iris['100x_sepal_length'] = iris[['sepal_length']].apply(multiply_by_100)
iris.head()

| | sepal_length | sepal_width | petal_length | petal_width | species | rounded_sepal_length | 100x_sepal_length |
|---|---|---|---|---|---|---|---|
| 0 | 5.1 | 3.5 | 1.4 | 0.2 | setosa | 5.0 | 510.0 |
| 1 | 4.9 | 3.0 | 1.4 | 0.2 | setosa | 5.0 | 490.0 |
| 2 | 4.7 | 3.2 | 1.3 | 0.2 | setosa | 5.0 | 470.0 |
| 3 | 4.6 | 3.1 | 1.5 | 0.2 | setosa | 5.0 | 460.0 |
| 4 | 5.0 | 3.6 | 1.4 | 0.2 | setosa | 5.0 | 500.0 |
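
The cells above select the column with double brackets, `iris[['sepal_length']]`, which gives a one-column DataFrame, so `apply` passes the whole column to the function at once. With single brackets the selection is a Series, and `apply` passes one value at a time. The sketch below illustrates the difference on a small stand-in DataFrame; `df` and `describe` are hypothetical names, not part of the original notebook.

```python
import pandas as pd

# Small stand-in for the iris data so the example runs without a download
df = pd.DataFrame({"sepal_length": [5.1, 4.9, 4.7]})

def describe(x):
    """Report the type of object that apply passes in."""
    return type(x).__name__

# Double brackets select a DataFrame: the function receives a whole column
by_column = df[["sepal_length"]].apply(describe)
# Single brackets select a Series: the function receives one value at a time
by_value = df["sepal_length"].apply(describe)

print(by_column)
print(by_value)
```

This matters for custom functions with conditions such as `if x > 5`, which work on single values but not on a whole column.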


In [None]:
iris["new_column"] = iris[['sepal_length']]
iris.head()

| | sepal_length | sepal_width | petal_length | petal_width | species | rounded_sepal_length | 100x_sepal_length | new_column |
|---|---|---|---|---|---|---|---|---|
| 0 | 5.1 | 3.5 | 1.4 | 0.2 | setosa | 5.0 | 510.0 | 5.1 |
| 1 | 4.9 | 3.0 | 1.4 | 0.2 | setosa | 5.0 | 490.0 | 4.9 |
| 2 | 4.7 | 3.2 | 1.3 | 0.2 | setosa | 5.0 | 470.0 | 4.7 |
| 3 | 4.6 | 3.1 | 1.5 | 0.2 | setosa | 5.0 | 460.0 | 4.6 |
| 4 | 5.0 | 3.6 | 1.4 | 0.2 | setosa | 5.0 | 500.0 | 5.0 |


In [None]:
iris.groupby("species").max()

| species | sepal_length | sepal_width | petal_length | petal_width | rounded_sepal_length | 100x_sepal_length | new_column |
|---|---|---|---|---|---|---|---|
| setosa | 5.8 | 4.4 | 1.9 | 0.6 | 6.0 | 580.0 | 5.8 |
| versicolor | 7.0 | 3.4 | 5.1 | 1.8 | 7.0 | 700.0 | 7.0 |
| virginica | 7.9 | 3.8 | 6.9 | 2.5 | 8.0 | 790.0 | 7.9 |


In [None]:
#iris.apply(pd.Series.round, axis=1)

In [None]:
#def sepal_category(x):
  
#  if x == 4:
#  return "big"

#iris['sepal_category'] = iris[['sepal_width']].apply(sepal_category)
#iris.head()
  

In [None]:
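# example of a function with a condition
# (despite its name, it does not multiply: it returns 1 for x > 5, else x unchanged)
def smart_multiply_by_100(x):
  if x > 5:
    return 1
  return x

inputs = [1,2,6,10]
for value in inputs:  # avoid shadowing the built-in input()
  print(smart_multiply_by_100(value))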
  
  

1
2
1
1


### Writing Lambdas

Lambdas are rarely necessary. A Python lambda is an inline, anonymous function, and overusing it can lead to confusing code.


In [None]:
import this

The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!


In [None]:
func = lambda x: x**2
func(4)

16

In [None]:
def regular_func(x):
  return x**2

regular_func(4)

16

In [None]:
def regular_func2(x):
  """This makes my variable go to the second power"""
  return x**2

In [None]:
regular_func2(2)

4
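
One place where a lambda is commonly seen is as a short `key` function for `sorted`. The sketch below shows the lambda form next to an equivalent named function; `name_length` and the small `attacks` dictionary are hypothetical and used only for illustration.

```python
attacks = {"kimura": "upper_body", "knee_bar": "lower_body", "keylock": "upper_body"}

# Inline lambda as a sort key: order attack names by length
by_length_lambda = sorted(attacks, key=lambda name: len(name))

def name_length(name):
    """Named equivalent of the lambda above."""
    return len(name)

by_length_named = sorted(attacks, key=name_length)

print(by_length_lambda)
print(by_length_lambda == by_length_named)
```

The named version can carry a docstring and shows up by name in tracebacks, which is the usual argument for preferring it.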

In [None]:
import random

In [None]:
random.seed

## Homework Exercises

### Can you return random attacks without repeating?

* Keep track of previous attacks and draw again if the new attack has already been returned.

```
def lazy_return_random_attacks():
    """Yield attacks each time"""
    import random
    attacks = {"kimura": "upper_body",
           "straight_ankle_lock":"lower_body", 
           "arm_triangle":"upper_body",
            "keylock": "upper_body",
            "knee_bar": "lower_body"}
    while True:
        random_attack = random.choices(list(attacks.keys()))
        yield random_attack
        
```

Pseudocode:

    if attack in previous_attacks:
        ...new random
    yield random_attack
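
One possible way to turn the pseudocode into running code is sketched below. It is not the notebook author's solution: the name `lazy_return_unique_attacks` is hypothetical, and the generator stops once every attack has been returned, so the redraw loop cannot run forever.

```python
import random

def lazy_return_unique_attacks():
    """Yield each attack once, in random order, then stop."""
    attacks = ["kimura", "straight_ankle_lock", "arm_triangle", "keylock", "knee_bar"]
    previous_attacks = set()
    while len(previous_attacks) < len(attacks):
        random_attack = random.choice(attacks)
        if random_attack in previous_attacks:
            continue  # already returned: draw again
        previous_attacks.add(random_attack)
        yield random_attack

unique = list(lazy_return_unique_attacks())
print(unique)
```

A shorter alternative would be to iterate over `random.sample(attacks, len(attacks))`, which shuffles once instead of redrawing.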

In [None]:
def func():
  l = [1,2,3]
  return l
func()
  

[1, 2, 3]

In [None]:
def lazy_func():
  l = [1,2,3]
  for item in l:
    yield item
gen = lazy_func()

In [None]:
next(gen)

StopIteration: ignored

In [None]:
def myfunc():
  return "apple"

def myfunc2():
  return 1

print(myfunc())
print(myfunc2())


apple
1


In [None]:
def practise(times):
    print(f"I like to exercise {times} a day")

SyntaxError: ignored

## FAQ

### Q:  Magic Commands

In [None]:
#shell command (import subprocess)
!pwd

/content


In [None]:
# magic
%%python2
print "hello"

hello


In [None]:
#built into IPython
who

AttackFinder	 Decimal	 Foo	 Use_Python	 alt	 attack	 attack_counter	 attack_location	 attack_one	 
attack_techniques	 attack_this	 attack_two	 attacks	 attacks_set_one	 attacks_set_two	 basic_string	 belt	 belts	 
bjj_dominant_positions	 bjj_position_matrix	 body_fat_percentage	 body_part	 body_parts	 cars	 data	 df	 drive	 
f	 fat_total	 favorite_martial_art	 favorite_martial_art_with_docstring	 fight	 files	 func	 getcontext	 guard_list	 
guards	 iris	 item	 items	 lazy_return_random_attacks	 list_of_bjj_positions	 math	 multiple_attacks	 multiply_by_100	 
my_attack	 my_string	 my_variable	 myfunc	 new_dict	 new_dictionary	 np	 num_type	 number	 
os	 partial	 pd	 position	 practice	 randomized_speed_attack_decorator	 regular_func	 regular_func2	 res	 
some_attacks	 steps	 string_with_hyphen	 submission	 submissions	 sys	 time	 timing	 too_many_decimals	 
unique_attacks	 unique_set_one_attacks	 uploaded	 var1	 variable	 weight	 wraps	 


In [None]:
#built in python function
l = [1]
len(l)
max(l)
min(l)

1

### None

In [None]:
var = None
if var == None:
  print("this is None")

this is None


In [None]:
ll = []
ll == None


False

In [None]:
ll == True

False

In [None]:
def my_func():
  return 1
my_func() == None

False

In [None]:
def my_func2():
  """Returns None"""
  pass
my_func2() == None

True
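
The comparisons above use `== None`, which works for these cases. PEP 8 recommends `is None` instead, because `==` can be overridden by a class. The sketch below uses a hypothetical class, `AlwaysEqual`, to show the difference.

```python
class AlwaysEqual:
    """Hypothetical class whose == comparison always succeeds."""
    def __eq__(self, other):
        return True

obj = AlwaysEqual()
print(obj == None)  # the custom __eq__ answers, so the result is misleading
print(obj is None)  # identity check: only the real None passes

def my_func2():
    """Returns None"""
    pass

print(my_func2() is None)
```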

### Next

In [None]:
def funky():
  l = [1,2,3]
  for item in l:
    yield item
    

In [None]:
f = funky()
next(f)

1

In [None]:
next(f)

2
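
Once a generator has yielded all of its items, a further `next()` raises `StopIteration`. The sketch below repeats the `funky` generator and shows both the exception and the optional default argument of `next()`; the names `values`, `exhausted`, and `fallback` are only for illustration.

```python
def funky():
    for item in [1, 2, 3]:
        yield item

f = funky()
values = [next(f), next(f), next(f)]  # consumes all three items

try:
    next(f)  # nothing left, so StopIteration is raised
    exhausted = False
except StopIteration:
    exhausted = True

fallback = next(f, "done")  # a default value suppresses the exception
print(values, exhausted, fallback)
```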

In [None]:
##check value
val = False
if val == True:
  print("yes")

### Q:  Why are Python Dictionaries so commonly used vs other data structures?

A: They offer fast lookup, insertion, and deletion by key (constant time on average), are easy to use, and are flexible to program with.
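
A short sketch of the basic dictionary operations, using a small hypothetical `attacks` dictionary:

```python
attacks = {"kimura": "upper_body", "knee_bar": "lower_body"}

attacks["keylock"] = "upper_body"             # insert by key
location = attacks["kimura"]                  # look up by key
missing = attacks.get("bear_hug", "Unknown")  # default for a missing key
has_knee_bar = "knee_bar" in attacks          # membership test on keys

print(location, missing, has_knee_bar, len(attacks))
```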

### Q: Can I use `+=1` to keep adding to the variable instead of replacing it with `upper_body_counter = upper_body_counter + 1`?

A: Yes, `var += 1` is *syntactic sugar* for `var = var + 1`.


In [None]:
count = 0
for i in range(3):
  count +=1
  print(count)

1
2
3


In [None]:
count = 3
for i in range(3):
  count -=1
  print(count)

2
1
0


### Q: Does a function with ‘return’ instead of using ‘print’ do the same thing?

A: No. `print` displays a value, while `return` hands the value back to the caller so that it can be used in further code.

In [None]:
#in jupyter notebook, the last line prints automatically
var = "blue"
var

'blue'

In [None]:
def simple_color():
  print("blue")

In [None]:
simple_color()

blue


In [None]:
def simple_color_with_return():
  return "blue"

In [None]:
simple_color_with_return()

'blue'

In [None]:
if simple_color_with_return() == "blue":
  print("green")

green


In [None]:
if simple_color() == "blue":
  print("green")
else:
  print("the function had no return value")

blue
the function had no return value


In [None]:
simple_color() == None
simple_color() == "blue"

blue
blue


False

A function with no `return` statement in Python returns `None`.

In [None]:
def some_function_with_no_return():
  var = "blue"

In [None]:
print(some_function_with_no_return())

None


In [None]:
def some_function_with_return():
  var = "blue"
  return var

In [None]:
print(some_function_with_return())

blue


In [None]:
result = some_function_with_return()
print(f"The color that is my favorite is {result}, but I also like green")

The color that is my favorite is blue, but I also like green


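### Q: The concept of `*args` and `**kwargs` is unclear
A: See below. `*args` collects any number of positional arguments into a tuple, and `**kwargs` collects any number of keyword arguments into a dictionary.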

In [None]:
def func_kwargs(**kwargs):
  for k,v in kwargs.items():
    print(k)
    print(v)


In [None]:
func_kwargs(one=1)

one
1


In [None]:
func_kwargs(one=1, two=2)

one
1
two
2
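
The cells above cover `**kwargs` only. The sketch below adds `*args` and shows the same symbols used at the call site to unpack a list and a dictionary; `func_args`, `numbers`, and `options` are hypothetical names.

```python
def func_args(*args, **kwargs):
    """Show how positional and keyword arguments are collected."""
    print(f"args is a {type(args).__name__}: {args}")
    print(f"kwargs is a {type(kwargs).__name__}: {kwargs}")
    return args, kwargs

collected_args, collected_kwargs = func_args(1, 2, three=3)

# The same symbols unpack a sequence or a dict at the call site
numbers = [1, 2]
options = {"three": 3}
unpacked = func_args(*numbers, **options)
print(unpacked == (collected_args, collected_kwargs))
```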


### Q: What is a DataFrame?
A: A data structure provided by the Pandas library, organized like an Excel spreadsheet into rows and named columns.

In [None]:
import pandas as pd
mydict= {}
mylist = []
mydf = pd.DataFrame()