# Agenda

1. Python
    1. Variables and values
    2. Conditionals and comparison
    3. Numbers (integers and floats)
    4. Strings
    5. Lists and tuples
    6. Dictionaries and sets
    7. Files
    8. Functions
    9. Comprehensions
    10. Modules 
2. Pandas
    1. NumPy
    2. Series
    3. Data frames
    4. Loading data from various formats (CSV, Excel, JSON, and basic SQL)
    5. Working with large files
    6. Grouping and pivot tables
    7. Window functions
    8. Sorting
    9. Plotting
    10. PySpark

In [1]:
print('Hello, world!')

Hello, world!


My configuration:

- I run Jupyter on my computer, and it saves every 1-2 minutes.
- In the directory where the notebook is saved, I'm running `gitautopush`.
- That directory is connected to a GitHub repo, to which it pushes automatically every 2-3 minutes.

In [3]:
# assignment is done with the = operator
# whatever is on the right should be assigned to the variable on the left
# Python is a dynamic language -- variables don't have specific types -- any variable can contain any value

x = 100

In [4]:
type(x)   # I'm asking -- what kind of value does x contain?

int

In [5]:
x = 'abcde'

type(x)

str

In [6]:
x = 10
y = 20

# let's add these numbers together!
# print lets us display anything we want on the screen

print(x+y)

30


In [7]:
# in Jupyter (and at other interactive Python prompts, but *NOT* in a regular program), we can just put an expression in our cell
# and if it's on the last line, we'll see the value

x + y

30

# Some more notes

1. A comment in Python starts with `#` and continues to the end of the line. Python has no multi-line comment syntax. However, if you highlight a few lines and press control-/, Jupyter (and many other editors) will comment or uncomment all of those lines.
2. If I want to execute all of the code in a cell, then I use shift+ENTER.
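
As a quick sketch of what that looks like (the variable and the comment text are only for illustration), several consecutive `#` lines work as a multi-line comment:

```python
# this comment takes up a whole line
# and so does this one; each line needs its own #

x = 100   # a comment can also follow code on the same line
print(x)
```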

# Some Jupyter commands and notes

Jupyter has two different modes:

- Edit mode -- the outline is green, and typing actually puts content into the cell (like right now). We can enter edit mode by clicking inside of the cell or by pressing ENTER.
- Command mode -- the outline is blue, and typing sends commands to Jupyter, not to the cell.  We can enter command mode by clicking to the left of the cell or pressing ESC.

Some commands in command mode:

- `c` -- copy the current cell
- `v` -- paste the copied (or cut) cell
- `x` -- cut the current cell
- `z` -- undo the last cut
- `y` -- turn the cell into a code cell
- `m` -- turn it into a Markdown (documentation) cell
- `a` -- add a new cell above the current one
- `b` -- add a new cell below the current one
- `h` -- help with all commands


This is formatted Python code:

```python
x = 100
print(x * 'a')
```

In [8]:
x = 'abcd'    # here, I'm putting the text in quotes, so it'll be taken literally
y = 'efgh'    # that is known in the programming world as a "string" -- use either single or double quotes

# what happens if I do this?

print(x + y)

abcdefgh


In [9]:
# what about this?

x = 10      # here, I have an integer, 10
y = '20'    # here, I have a string, '20'

x + y

TypeError: unsupported operand type(s) for +: 'int' and 'str'
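
One way around this error is to convert one of the values explicitly, so that both operands have the same type. Here is a minimal sketch using the built-in `int` and `str` functions; the names `total` and `joined` are just for illustration:

```python
x = 10      # an integer
y = '20'    # a string containing digits

# turn the string into an integer, then add numerically
total = x + int(y)
print(total)     # 30

# turn the integer into a string, then join the two strings
joined = str(x) + y
print(joined)    # 1020
```

Note that `int(y)` only works if the string contains something that looks like an integer; otherwise, Python raises a `ValueError`.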

In [10]:
# I can get input from the user with the "input" function

# when I call input, I'll normally want to pass a string as an argument to it
# that string will be displayed to the user, and then whatever the user enters
# will be returned by input

# in all assignment, the right side executes before the left side
# whatever value is on the right side is assigned to the variable on the left side

name = input('Enter your name: ')

Enter your name: Reuven


In [11]:
print(name)

Reuven


In [12]:
print('Hello, ' + name)

Hello, Reuven


In [13]:
print('Hello, ' + name + '.')

Hello, Reuven.


# Comparison

We've seen that we can assign with `=`.

What if we want to compare two values? We'll have to use another operator, the `==`.  This is the operator that asks the question: Are the two values on either side equal to one another?

This operator will return one of two values: `True` or `False`.  These are known as "boolean" values.

There are a bunch of comparison operators that I can use:

- `==` (equal)
- `!=` (not equal)
- `>` (greater than)
- `<` (less than)
- `>=` (greater than or equal)
- `<=` (less than or equal)
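
Here is a small sketch that exercises `!=`, `>=`, and `<=`; the variables `a` and `b` are just for illustration:

```python
a = 10
b = 20

print(a != b)    # True, because 10 and 20 are different
print(a >= 10)   # True, because 10 is equal to 10
print(b <= 10)   # False, because 20 is greater than 10
```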



In [14]:
5 > 3

True

In [15]:
100 == 100

True

In [16]:
8 < 2

False

In [18]:
'hello' == 'hello'    # we can compare strings!

True

In [19]:
'hello' > 'goodbye'   # does 'hello' come after 'goodbye' alphabetically?

True

In [21]:
'Z' > 'a'    # all capital letters come before all lowercase letters!

False
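
The reason is that Python compares strings one character at a time, using each character's Unicode code point (a number). As a rough sketch, the built-in `ord` function shows us that number:

```python
print(ord('Z'))    # 90
print(ord('a'))    # 97

# 90 is less than 97, so 'Z' is "smaller" than 'a'
print('Z' > 'a')   # False
print('Z' < 'a')   # True
```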

# Python is case sensitive!

- Variable names are case sensitive; we normally use only lowercase letters and the `_` character in variable names. (You can use digits as well, but not as the first character.)  The variable `x` and the variable `X` are not at all the same thing!

- Strings are generally case sensitive! So if someone gives their name as `'Reuven'` and we check for `'reuven'`, they will not be seen as the same thing.
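
If we want a comparison that ignores case, one common approach is to convert the string to lowercase before comparing. This is only a sketch; the value of `name` is hard-coded here, where a real program would probably get it from `input`:

```python
name = 'REUVEN'    # pretend the user typed this

print(name == 'Reuven')           # False, because the case differs
print(name.lower() == 'reuven')   # True, because we compare lowercase to lowercase
```

The `lower` method returns a new string; it doesn't change the value stored in `name`.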

In [26]:
name = input('Enter your name: ')

# When we use "if" for a comparison:
# (1) no parentheses around the comparison.
# (2) at the end of the line, you put a colon (:)
# (3) after the colon, we have a "block" of code, indented -- traditionally, 4 spaces per level        
# (4) when the indentation ends, the block ends, as well

if name == 'Reuven':
    print('Hello, boss!')
    print('It is so nice to see you again!')

# else is optional!  
# it doesn't take a condition -- it only runs if the "if" condition was False

else:
    print('Hello, ' + name + '.')

Enter your name: someone else
Hello, someone else.


In [23]:
# print can take multiple arguments
print('a', 'b', 'c', 'd')

a b c d


In [24]:
print('Hello, ', name, '.')

Hello,  Reuven .
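
Those extra spaces appear because `print` puts a space between its arguments by default. As a sketch, we can change that separator with the `sep` keyword argument:

```python
name = 'Reuven'

# sep='' means: put nothing between the arguments
print('Hello, ', name, '.', sep='')    # Hello, Reuven.

# any string can be the separator
print('a', 'b', 'c', sep='-')          # a-b-c
```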


In [30]:
# more than two options? We have "elif"!

name = input('Enter your name: ')

if name == 'Reuven':
    print('Hello, boss!')
    print('It is so nice to see you again!')

elif name == 'someone else':
    print('That is not a real name!')

else:
    print('Hello, ' + name + '.')

Enter your name: whatever
Hello, whatever.


In [31]:
# making our strings nicer to read

# we can use what are called "f-strings," short for "format strings" or "fancy strings."

x = 'abcd'
y = 'efgh'

print('x == ' + x + ', and y == ' + y + '.')

x == abcd, and y == efgh.


In [32]:
# it's even worse if I have integers!

x = 10
y = 20

print('x == ' + x + ', and y == ' + y + '.')

TypeError: can only concatenate str (not "int") to str

In [33]:
# an f-string is exactly the same as a regular string *except* that you can put
# variables and other Python values inside of {}

# anything you put there will be turned into a string!

print(f'x == {x} and y == {y}')

x == 10 and y == 20
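
The `{}` in an f-string aren't limited to plain variables. As a small sketch, here is an expression inside the braces, plus a format specifier after a colon; the variable `price` is made up for this example:

```python
x = 10
y = 20

# any expression can go inside the braces
print(f'x + y == {x + y}')        # x + y == 30

# after a colon, we can say how to format the value
price = 3.14159
print(f'price == {price:.2f}')    # price == 3.14
```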


In [34]:
# interpolation

# f-string info: https://fstring.help/

# What if I want to check multiple things?

For example, I might want to print something only if `x` is 10 *and* `y` is 20; that is, only if *both* comparisons are `True`.

To do that, I need to use `and`, which returns `True` if the expressions on both its left and right are `True`. Otherwise, it returns `False`.

Similarly, we have `or`, which returns `True` if the expression on its left or the one on its right (or both) is `True`.  If both are `False`, then it returns `False`.

In [35]:
x = 10
y = 20

# True     and        True   --> True
x == 10    and   y == 20

True

In [36]:
if x == 10    and   y == 20:
    print('yes, both are what you want!')

yes, both are what you want!


In [37]:
# True     and       False  -> False

x == 10    and   y == 55

False

In [40]:
# short-circuit evaluation 

# - if we have an "or" expression, and the first item is True, we don't check the second
# - if we have an "and" expression, and the first item is False, we don't check the second

# False    or       True
x == 55    or   y == 20

True
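
One way to see short-circuit evaluation in action is to put something on the right side that would fail if Python ever ran it. In this sketch, dividing by `y` would raise a `ZeroDivisionError`, but the division never happens:

```python
x = 10
y = 0

# x == 10 is True, so "or" never evaluates 1 / y
print(x == 10 or 1 / y == 5)    # True

# y != 0 is False, so "and" never evaluates x / y
print(y != 0 and x / y > 1)     # False
```

The second line is a common pattern: check that a value is safe to use on the left, and only then use it on the right.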

In [41]:
x = 10

if x == 10:
    print('Yes!')

Yes!


In [43]:
if x:
    print('It is True-ish')
else:
    print('It is False-ish')         
          

It is True-ish


In [44]:
name = 'Reuven'

if name:
    print('It is True-ish')
else:
    print('It is False-ish')         


It is True-ish


# Boolean context

An `if` will *always* look to its right and ask for a boolean value (`True`/`False`).

If there isn't a boolean value there, it asks Python to convert the value that is there to a boolean. Every single value in Python is considered `True` in that context except for a few:

- `False`
- `None` 
- `0` (and other numeric zeros, such as `0.0`)
- anything empty -- an empty string, list, tuple, dict, or set
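
We can ask Python directly how it would treat a value in a boolean context by calling the built-in `bool` function. A quick sketch:

```python
# all of these are considered False
print(bool(False))
print(bool(None))
print(bool(0))
print(bool(0.0))
print(bool(''))       # empty string
print(bool([]))       # empty list
print(bool(()))       # empty tuple
print(bool({}))       # empty dict
print(bool(set()))    # empty set

# these are all considered True, because they are neither zero nor empty
print(bool('0'))      # a string containing one character
print(bool(' '))      # a string containing a space
print(bool([0]))      # a list containing one element
print(bool(-1))       # any nonzero number
```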

In [45]:
# if I want to check whether the user gave me a name, I can do this:

name = input('Enter your name: ')

if name:   # if it's not the empty string
    print(f'Hello, {name}')
else:
    print('You did not enter a name!')

Enter your name: 
You did not enter a name!


It's considered "Pythonic" to check for an empty string and many other values by just putting a variable inside of an `if` statement.  If we get a `None` or empty-string value, then that'll be considered `False`.

The `not` operator flips the logic on whatever is to its right.

In [46]:
x = 10

not x == 10

False

In [47]:
not x == 20

True
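
Notice that `not` applies to the whole comparison, because `==` is evaluated before `not`. As a sketch, these three expressions all ask the same question:

```python
x = 10

print(not x == 20)      # True
print(not (x == 20))    # True; the parentheses just make the order explicit
print(x != 20)          # True; often the most readable way to write it
```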

In [None]:
# this is common Python code:

name = input('Enter your name: ')

if not name:   # did we get an empty string?
    print('Why did you not enter a name? ')
    
else:
    print(f'Hello, {name}!')

In [48]:
s = ''

len(s)  # how many characters?

0

In [49]:
if s:
    print('It is True-ish')
else:
    print('It is False-ish')

It is False-ish


In [50]:
s = input('Enter something: ')

Enter something: 


In [51]:
s

''

In [52]:
x = 10

if x:
    print('It is True-ish')
else:
    print('It is False-ish') 

It is True-ish


In [53]:
x = 0

if x:
    print('It is True-ish')
else:
    print('It is False-ish') 

It is False-ish


In [54]:
x = '0'

if x:
    print('It is True-ish')
else:
    print('It is False-ish') 

It is True-ish


# Exercise: Name and company

1. Ask the user to enter their name, and assign it to `name`.
2. Ask the user to enter their company, and assign it to `company`.
3. Print one of four things:
    - If the name and company are both the same as yours, say something like, "You must be me!"
    - If the name is the same but the company is different, say something like, "Great name, but you work for a terrible company."
    - If the name is different but the company is the same, say something like, "You are my colleague."
    - If both are different, then say, "I want nothing to do with you."

In [None]:
name = input('Enter your name: ')
company = input('Enter your company: ')

if name == 'Reuven' and company == 'Lerner':
    