# Lab 1 - Python Basics

The goal of this lab is to work through the basics of Python with
a focus on the aspects that are important for data science and machine learning.

## Working with types

Unlike other languages that you may have worked with in the past,
Python does not make the user declare the "types" of variables
(numbers, strings, classes). However, as a programmer it is important
for you to know the differences between types and how they work.

### Numbers

Numbers are the simplest type we will work with. Most of the time
you can ignore the difference between integers and decimal types.
Python will handle the conversions for you. 

In [1]:
number1 = 10.5
number2 = 20

In [2]:
number3 = number1 + number2
number3

30.5
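As a small sketch of the conversions mentioned above (the variable names here are illustrative and not part of the lab), mixing an integer and a decimal number gives a decimal number, and the two division operators behave differently:

```python
# Illustrative names; not part of the original lab.
whole = 20          # an integer
decimal = 10.5      # a decimal (floating-point) number

total = whole + decimal      # Python converts the integer for us
print(type(whole))           # <class 'int'>
print(type(total))           # <class 'float'>

# Regular division always gives a float; floor division drops the remainder.
half = 7 / 2
floor_half = 7 // 2
print(half)                  # 3.5
print(floor_half)            # 3
```

This is one of the few places where the integer/decimal difference tends to matter in practice.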

### Strings

Strings are very easy to use in Python. You can
just use quotes to create them. To combine two strings
you simply add them together.

In [3]:
string1 = "Hello "
string2 = "World"

In [4]:
string3 = string1 + string2
string3

'Hello World'
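Adding only works between two strings. The sketch below (with made-up variable names) shows two common ways to combine a string with a number: converting with `str`, or using an f-string.

```python
# Illustrative names; not part of the original lab.
name = "World"
count = 3

# "Hello " + count would raise a TypeError, so convert the number first.
message1 = "Hello " + name + " x" + str(count)

# An f-string inserts the values (and converts them) for you.
message2 = f"Hello {name} x{count}"

print(message1)   # Hello World x3
print(message2)   # Hello World x3
```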

### Lists

Python has a simple type for multiple values called a list.
This differs slightly from array types in other languages as you
don't need to declare the size of the list. 

In [5]:
list1 = [1, 2, 3]
list2 = [4, 5]

Adding two lists together creates a new list combining the two.

In [6]:
list3 = list1 + list2
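To see what the addition produced, the following sketch repeats the cells above and then looks at the result. The use of `len`, indexing, and `append` is added here for illustration.

```python
list1 = [1, 2, 3]
list2 = [4, 5]
list3 = list1 + list2

print(list3)        # [1, 2, 3, 4, 5]
print(list1)        # [1, 2, 3]  (the original lists are unchanged)
print(len(list3))   # 5
print(list3[0])     # 1  (positions start counting at 0)

# append changes a list in place instead of creating a new one.
list2.append(6)
print(list2)        # [4, 5, 6]
```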

### Dictionaries

A dictionary type is used to link a "key" to a "value".
You can have as many keys and values as you want, and they can
be of most of the types that we have seen so far. 

In [7]:
dict1 = {"apple": "red",
         "banana": "yellow"}
dict1

{'apple': 'red', 'banana': 'yellow'}

To access a value of the dictionary, you use the square bracket notation
with the key that you want to access.

In [8]:
dict1["apple"]

'red'

In [9]:
dict1["banana"]

'yellow'

You can also add a new key to the dictionary by setting its value.

In [10]:
dict1["pear"] = "green"
dict1

{'apple': 'red', 'banana': 'yellow', 'pear': 'green'}
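Asking for a key that is not in the dictionary with square brackets raises a `KeyError`. A minimal sketch of two ways around that, plus a loop over all the pairs (the `"grape"` key and the fallback value are made up for the example):

```python
dict1 = {"apple": "red", "banana": "yellow", "pear": "green"}

# Check for a key with `in`, or use .get() with a fallback value.
has_grape = "grape" in dict1
grape_color = dict1.get("grape", "unknown")
print(has_grape)     # False
print(grape_color)   # unknown

# Walk through every key/value pair.
for fruit, color in dict1.items():
    print(fruit, "is", color)
```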

## Control Structures

### `if` statements

An `if` statement checks a condition and runs the
code beneath it only when the condition is true. In Python you must indent
the code under the `if` statement; the indentation is how Python knows
which lines belong to it.

In [11]:
number3 = 10 + 75.0

In [12]:
if number3 > 50:
    print("number is greater than 50")

number is greater than 50


In [13]:
if number3 > 100:
    print("number is greater than 100")

You can also have a backup `else` code block that will run if 
the condition is not true.

In [14]:
if number3 > 100:
    print("number is greater than 100")
else:
    print("number is not greater than 100")

number is not greater than 100
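When there are more than two cases, `elif` lets you chain further checks. This is a small sketch that combines the two conditions used above; the `label` variable is introduced only for the example.

```python
number3 = 10 + 75.0

# Conditions are tried from top to bottom; the first true one wins.
if number3 > 100:
    label = "number is greater than 100"
elif number3 > 50:
    label = "number is greater than 50"
else:
    label = "number is 50 or less"

print(label)   # number is greater than 50
```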


### `for` loops

For loops in Python are used to step through the items in a list one by one.

In [15]:
list3

[1, 2, 3, 4, 5]

You write a for loop in the following manner. The code will be run 5 times,
with the variable `value` taking on a new value each time through the loop.

In [16]:
for value in list3:
    print("Next value is: ", value)

Next value is:  1
Next value is:  2
Next value is:  3
Next value is:  4
Next value is:  5
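A loop is often used to build up a result one step at a time. As a sketch, here is the same list added up by hand (the `total` variable is introduced for the example; the built-in `sum` does the same job):

```python
list3 = [1, 2, 3, 4, 5]

total = 0
for value in list3:
    total = total + value   # add each item to the running total

print(total)        # 15
print(sum(list3))   # 15
```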


## Importing and Reading Docs

Python allows the user to specify their own types to represent
additional properties.  We will use many other types throughout the
class. To use these types we need to `import` them from libraries
that store the code. 

Here are a couple of examples.

### Counters

First we add a line of code to import a new type into our program.
It will often look something like this.

In [17]:
import collections
# We can then use the type like this.

In [18]:
count = collections.Counter([1, 2, 1, 2, 1, 1, 1])
count

Counter({1: 5, 2: 2})

How did we know that this function would count the items in a list?

We didn't! We had to read the documentation. Mostly this means you
go to Google and type "how do I count the number of items in a
list in python". You then click on the link from Stack Overflow and
some nice person tells you the answer. It won't always be the first
answer, but just keep trying until you find it.
https://stackoverflow.com/questions/2161752/how-to-count-the-frequency-of-the-elements-in-an-unordered-list

Another method is to call the built-in `help` function in your notebook.

In [19]:
help(collections.Counter)

Help on class Counter in module collections:

class Counter(builtins.dict)
 |  Counter(iterable=None, /, **kwds)
 |  
 |  Dict subclass for counting hashable items.  Sometimes called a bag
 |  or multiset.  Elements are stored as dictionary keys and their counts
 |  are stored as dictionary values.
 |  
 |  >>> c = Counter('abcdeabcdabcaba')  # count elements from a string
 |  
 |  >>> c.most_common(3)                # three most common elements
 |  [('a', 5), ('b', 4), ('c', 3)]
 |  >>> sorted(c)                       # list all unique elements
 |  ['a', 'b', 'c', 'd', 'e']
 |  >>> ''.join(sorted(c.elements()))   # list elements with repetitions
 |  'aaaaabbbbcccdde'
 |  >>> sum(c.values())                 # total of all counts
 |  15
 |  
 |  >>> c['a']                          # count of letter 'a'
 |  5
 |  >>> for elem in 'shazam':           # update counts from an iterable
 |  ...     c[elem] += 1                # by adding 1 to each element's count
 |  >>> c['a']                

This is a bit more complex, but it does tell us more about how the Counter works. For instance, it tells us how to get the most common element in the list.

In [20]:
help(count.most_common)

Help on method most_common in module collections:

most_common(n=None) method of collections.Counter instance
    List the n most common elements and their counts from the most
    common to the least.  If n is None, then list all element counts.
    
    >>> Counter('abracadabra').most_common(3)
    [('a', 5), ('b', 2), ('r', 2)]



In [21]:
count.most_common(1)

[(1, 5)]
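The same type works on any list of items, not just numbers. A brief sketch with a made-up list of words:

```python
import collections

# Hypothetical data for illustration.
words = ["red", "blue", "red", "green", "red", "blue"]
word_count = collections.Counter(words)

print(word_count["red"])           # 3
print(word_count["purple"])        # 0  (items never seen count as zero)
print(word_count.most_common(2))   # [('red', 3), ('blue', 2)]
```

Each entry returned by `most_common` pairs an item with its count, which is why the result above is a list of pairs.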

More than anything, remember this: the best programmers use help the
most! No one wins a prize for memorizing the most functions. If you
want to be a good programmer, learn how to look things up quickly and
ask the most questions.

### Dates

Another very common type that we will want to handle is the type for
dates. I forgot how this works, so let's Google it: "how do i get the
current time in python".


The link we get back is here
https://stackoverflow.com/questions/415511/how-to-get-the-current-time-in-python

It tells us we can do it like this:

In [22]:
import datetime
date1 = datetime.datetime.now()
date1

datetime.datetime(2021, 5, 27, 12, 7, 0, 99992)

The format of the output of the line above tells us that we can
access the day, month, and year of the current date in the following
manner. Here `date1` is a special type, but the day, month, and year
are just standard numbers.

In [23]:
date1.day

27

In [24]:
date1.year

2021

In [25]:
date1.month

5

If we want to turn the months into more standard strings we can
do so by making a dictionary.

In [26]:
months = {
    1 : "Jan",
    2 : "Feb",
    3 : "Mar",
    4 : "Apr",
    5 : "May",
    6 : "Jun",
    7 : "Jul",
    8 : "Aug",
    9 : "Sep",
    10: "Oct",
    11 : "Nov",
    12 : "Dec"
}
months

{1: 'Jan',
 2: 'Feb',
 3: 'Mar',
 4: 'Apr',
 5: 'May',
 6: 'Jun',
 7: 'Jul',
 8: 'Aug',
 9: 'Sep',
 10: 'Oct',
 11: 'Nov',
 12: 'Dec'}

In [27]:
months[1]

'Jan'

We can convert it to a month name through a lookup.

In [28]:
months[date1.month]

'May'
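The standard library can also do this lookup for us. The sketch below uses a fixed date (so the output does not depend on when it runs) and assumes Python's default English locale for the month name; `fixed_date` is a name introduced for the example.

```python
import calendar
import datetime

# A fixed date matching the output shown above, instead of now().
fixed_date = datetime.datetime(2021, 5, 27, 12, 7, 0)

# calendar.month_abbr works like our dictionary: index 5 gives the
# abbreviation for May (in the default English locale).
print(calendar.month_abbr[fixed_date.month])

# strftime formats a date as a string using format codes.
iso_text = fixed_date.strftime("%Y-%m-%d")
print(iso_text)   # 2021-05-27
```

Building the dictionary by hand is still a useful exercise; this is simply what you would likely find after searching for it.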

If we want to see all the months we walk through them with a `for` loop.

In [29]:
for month in [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]:
    print(months[month])

Jan
Feb
Mar
Apr
May
Jun
Jul
Aug
Sep
Oct
Nov
Dec


Python also includes a nice shortcut that makes it easy to write for
loops like this. The function `range` produces a sequence of numbers that starts from
a value and stops right before the end value.

In [30]:
for month in range(1, 12 + 1):
    print(months[month])

Jan
Feb
Mar
Apr
May
Jun
Jul
Aug
Sep
Oct
Nov
Dec
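To see exactly which numbers `range` produces, you can wrap it in `list`. A short sketch; the one-argument and three-argument forms are shown here for illustration:

```python
# The end value itself is not included, hence the 12 + 1.
month_numbers = list(range(1, 12 + 1))
print(month_numbers)   # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]

# With one argument, counting starts at 0.
print(list(range(5)))          # [0, 1, 2, 3, 4]

# A third argument sets the step size.
print(list(range(0, 10, 2)))   # [0, 2, 4, 6, 8]
```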


## Working with Text

Throughout this class we will work a lot with text. At first this will
be just working with names, but it will quickly move to more complex
text and eventually artificial intelligence over text.

Text will also be represented with a string type. This is created with
quotes. 

In [31]:
str1 = "A sample string to get started"

Just like with lists, we can make a for loop over strings to get individual letters. 

In [32]:
for letter in str1:
    print(letter)

A
 
s
a
m
p
l
e
 
s
t
r
i
n
g
 
t
o
 
g
e
t
 
s
t
a
r
t
e
d


In [33]:
vowels = ["a", "e", "i", "o", "u"]
for letter in str1:
    if letter in vowels:
        print(letter)

a
e
i
o
e
a
e
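Notice that the capital "A" at the start of the string was not printed, because `"A"` and `"a"` are different letters to Python. One possible fix, sketched below, is to lowercase the string first; the `vowel_count` variable is introduced for the example.

```python
str1 = "A sample string to get started"
vowels = ["a", "e", "i", "o", "u"]

vowel_count = 0
for letter in str1.lower():   # lower() returns an all-lowercase copy
    if letter in vowels:
        vowel_count = vowel_count + 1

print(vowel_count)   # 8
```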


However, most of the time it will be better to use one of the built-in
string functions in Python. It is usually best to Google for these, but
here are some important ones to remember.

### Split
Splits a string up into a list of strings based on a separator.

In [34]:
str1 = "a:b:c"
list_of_splits = str1.split(":")
list_of_splits[1]

'b'

### Join
Joins a string back together from a list.

In [35]:
str1 = ",".join(list_of_splits)
str1

'a,b,c'
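Split and join are often used together to change the separator in a piece of text. A small sketch with made-up data:

```python
# Hypothetical input for illustration.
line = "2021,5,27"

parts = line.split(",")
print(parts)      # ['2021', '5', '27']  (the pieces are still strings)

rejoined = "-".join(parts)
print(rejoined)   # 2021-5-27

# With no separator given, split() breaks on whitespace.
words = "A sample string".split()
print(words)      # ['A', 'sample', 'string']
```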

### Replace
Replaces some part of a string. 

In [36]:
original_str = "Item 1 | Item 2 | Item 3"
new_str = original_str.replace("|", ",")
new_str

'Item 1 , Item 2 , Item 3'

In [37]:
new_str = original_str.replace("|", "")
new_str

'Item 1  Item 2  Item 3'
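Both results above keep the spaces that surrounded each `|`. If that is not what you want, one option is to include the spaces in the text being replaced, as in this sketch (`tidy_str` is a name introduced for the example):

```python
original_str = "Item 1 | Item 2 | Item 3"

# Replace the separator together with its surrounding spaces.
tidy_str = original_str.replace(" | ", ", ")
print(tidy_str)       # Item 1, Item 2, Item 3

# replace returns a new string; the original is unchanged.
print(original_str)   # Item 1 | Item 2 | Item 3
```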

### Contains
Checks if one string contains another.

In [38]:
original_str = "Item 1 | Item 2 | Item 3"
contains1 = "Item 2" in original_str
contains1

True

In [39]:
contains2 = "Item 4" in original_str 
contains2

False
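The `in` check is case-sensitive. The sketch below shows one way to ignore case, along with the related `startswith` function; the variable names are introduced for the example.

```python
original_str = "Item 1 | Item 2 | Item 3"

exact_match = "item 2" in original_str
print(exact_match)        # False  (lowercase "i" does not match "I")

# Lowercase the text first to ignore case.
loose_match = "item 2" in original_str.lower()
print(loose_match)        # True

# startswith checks only the beginning of the string.
starts_with_item = original_str.startswith("Item 1")
print(starts_with_item)   # True
```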

### Conversions
Converts between a string and a number.

In [40]:
int1 = int("15")
int1

15

In [41]:
decimal1 = float("15.50")
decimal1

15.5
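A conversion fails with a `ValueError` when the text does not look like the requested kind of number. The sketch below uses `try`/`except`, which this lab has not covered, so treat it as a preview rather than required material:

```python
text = "15.50"

# int() cannot read "15.50" because of the decimal point,
# so fall back to float() when it fails.
try:
    value = int(text)
except ValueError:
    value = float(text)
print(value)   # 15.5

# str() converts in the other direction, from a number to a string.
label = str(value) + " dollars"
print(label)   # 15.5 dollars
```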

## Functions

Functions are small snippets of code that you may want to use
multiple times.

In [42]:
def add_man(str1):
    return str1 + "man"

In [43]:
out = add_man("bat")
out

'batman'

Most of the time, functions should not change the variables that
are sent to them. For instance here we do not change the variable `y`.

In [44]:
y = "bat"
out = add_man(y)
out

'batman'

In [45]:
y

'bat'
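Strings can never be changed by a function, but lists can. As a sketch of the difference, the two hypothetical functions below (not part of the lab) add an item to a list; only the first follows the advice above.

```python
def add_item_copy(items, item):
    # Builds a new list; the caller's list is left alone.
    return items + [item]

def add_item_in_place(items, item):
    # Changes the caller's list directly.
    items.append(item)
    return items

heroes = ["bat"]
new_heroes = add_item_copy(heroes, "spider")
print(heroes)       # ['bat']
print(new_heroes)   # ['bat', 'spider']

add_item_in_place(heroes, "super")
print(heroes)       # ['bat', 'super']
```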

One interesting aspect of Python is that it lets you pass functions
to functions. For instance, the built-in function `map` is a function
that applies another function to each element of a list.

Assume we have a list like this.

In [46]:
word_list = ["spider", "bat", "super"]

If we want a list with `man` added to each we cannot run the following:

Doesn't work:  add_man(word_list)

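However, the map function makes this work by applying `add_man` to each item. In Python 3, `map` returns a `map` object that computes its results only when asked, which is why the output below is not a readable list; wrapping it in `list(...)` produces the actual list.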

In [47]:
out = map(add_man, word_list)
out 

<map at 0x7f1348653ca0>
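To see the actual values, convert the result of `map` to a list. The sketch below repeats `add_man` so that it runs on its own, and also shows a list comprehension, a common alternative not covered above.

```python
def add_man(str1):
    return str1 + "man"

word_list = ["spider", "bat", "super"]

# list() forces the map object to produce all of its values.
out = list(map(add_man, word_list))
print(out)    # ['spiderman', 'batman', 'superman']

# A list comprehension gives the same result.
out2 = [add_man(word) for word in word_list]
print(out2)   # ['spiderman', 'batman', 'superman']
```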

## Exercises

In [48]:
teacher_str = "Sasha Rush,arush@cornell.edu,Roosevelt Island,NYC"

In [49]:
name, email, location, city =  teacher_str.split(",")

Todo - Keyword args