# Python Syntax & Fundamentals 
## Part 1: Quick Guide & Definitions

[Click here to open this notebook in your browser](https://leifwalsh.github.io/data-analysis-problem-sets/lab/index.html?path=1-foundations/1.2-jupyter-and-python-syntax/1.2-jupyter-and-python-syntax-part-1.ipynb)

In this section we are going to review the following topics:
1. Running Cells
2. Comments
3. Values and a few primitive types
4. Variables
5. Arithmetic Operations
6. `+=` Operator
7. Cell Outputs
8. Numbers in Python
9. Strings
10. `print()` function
11. Parentheses and Brackets
12. Variables
13. f-strings
14. Errors
15. Basic data structures


(*sources: Codecademy, docs.python.org, Leif Walsh, Sarah Crandall*)


## Quick Guide & Definitions

Python is a programming language that uses "commands" to communicate with a
computer. We convey our commands to the computer by writing them in text files
that are called programs. Running a program means telling a computer to read
the text file, translate it to the set of operations that it understands, and
then perform those actions.

The Python programming language is different from the command line language we
saw in [Chapter 1.1](../1.1-command-line/1.1-command-line.ipynb). We're going
to do most of our work in Python.

A very simple Python program is:

```python
print("The sum of 3 and 4 is", 3 + 4)
```

If you have a terminal open, you can run this program:

```bash
$ python -c 'print("The sum of 3 and 4 is", 3 + 4)'
The sum of 3 and 4 is 7
```

### What is a Jupyter Notebook?

Jupyter is an interactive programming environment, which means you'll write a
little code, run it, see what it does, and then write some more code.

In contrast with traditional software development, where you typically write
some code, then run it from the start (or run your tests from the start) each
time you make changes, with Jupyter you're interacting with a live environment.
You can load a bunch of data, run some things, and then make changes without
restarting the process or discarding the data you've already loaded.

This makes iterating on an idea much faster, especially when you're working
with large data sets. But it can also be dangerous, as we'll see: if you aren't
careful about how you structure your code, you can get a bit lost if you change
too many things or forget what order you ran them.

## Running Cells

In Jupyter, you write code in "cells" that contain snippets of code, which you
can run in any order, but you usually run them top to bottom.

To run a cell, click the "play" button, or use Ctrl+Enter to evaluate it, or
Shift+Enter to evaluate it and move to the next cell.

Try running your first cell now. Click on the next box and press Ctrl+Enter.

In [None]:
print("The sum of 3 and 4 is", 3 + 4)

Now edit the cell: change one of the numbers, and run it again to see what it
does.

## Comments

A comment is a piece of text within a program that is not executed. It can be used to provide additional information about the code. Comments can:
- help other people read the code
- make Python ignore a line of code, so you can test how a program runs without it
- provide notes for yourself when experimenting with your code

The `#` character starts a comment, which continues until the end of the line.

Try running the next cell. You'll see it only does one thing:

In [None]:
# this is a comment--anything (like this text) that starts with the octothorpe
# (#) character and continues to the end of the line. Comments are just
# for you and your readers! The comments aren't evaluated as part of the
# program, but everything else is evaluated, one line at a time.

print("this is not a comment") #this is also a comment
# and so is this.

## Values and a few primitive types

So far we've seen a couple of "primitive" types of values in Python:

- `"The sum of 3 and 4 is"` is a "string", which is a weird jargon word that
  means "text". You can use single quotes or double quotes to create a string.
- `3` and `4` are "integers", which are whole numbers.

There are a few other primitive types:

- `3.14` is a "float", which is a number with a decimal point (again with the
  jargon, "float" means "floating point number" and refers to the way the
  computer stores these numbers).
- `True` and `False` are "booleans", which are a special type of value that
  represents truth or falsehood. For example, `3 < 4` is `True`, but `3 > 4` is
  `False`.
- `None` is a special value in Python that represents "nothing".
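
If you're ever unsure which type a value has, one way to check is the built-in `type()` function. This chapter doesn't otherwise cover `type()`, so treat this as a small preview rather than something you need to memorize:

```python
# type() is a built-in function that reports what kind of value something is.
print(type("hello"))   # <class 'str'>   (a string)
print(type(3))         # <class 'int'>   (an integer)
print(type(3.14))      # <class 'float'>
print(type(3 < 4))     # <class 'bool'>  (3 < 4 is True)
print(type(None))      # <class 'NoneType'>
```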

In Jupyter, whenever a cell ends with some value, that value is displayed
in the output. For example, if you run a cell with just `3` in it, the output
will be `3`. If you run a cell with just `"hello"`, the output will be
`'hello'` (Jupyter shows strings with single quotes by default). Try running these cells:


In [None]:
3

In [None]:
"hello"

## Variables

A variable is a name that refers to a value. You can think of it as a label
that you stick on a value so you can refer to it later. For example, you might
have a variable `x` that refers to the number `3`. Then you can use `x` in your
program, and it will be the same as if you had written `3`.

You can create a variable by writing a name, then an equals sign, then a value.
For example:

```python
x = 3
```

This creates a variable `x` that refers to the number `3`.

Try this:

In [None]:
x = 3
y = 4
print("The sum of", x, "and", y, "is", x + y)

If you have a cell that just assigns some variables, it doesn't print anything:

In [None]:
w = 5
z = "six"

But after you run that, you can use those names:

In [None]:
w + x

In [None]:
print(z)

Each time you run a cell, the variables you've created are stored in the
working memory of your notebook. You can change what a variable means (just by
using the equals sign again), and this may change what other cells do, if they
use that variable.

Let's try it. Create a variable `a` and set it to `3` in this cell:

Now, print `a * 2` in this cell:

Now change `a` to `4` in this cell (just type `a = 4` and press Ctrl+Enter):

Finally, go back to the cell where you printed `a * 2` and run it again. What
happened?

## Arithmetic

Python has a few arithmetic operations:

- `+` for addition
- `-` for subtraction
- `*` for multiplication
- `/` for division
- `**` for exponentiation
- `%` for modulo (which gives the remainder when you divide one number by
  another)

Try a few:

In [None]:
3 + 4

In [None]:
3 - 1

In [None]:
6 * 7

In [None]:
8 / 4

In [None]:
10 ** 3

In [None]:
100 % 3

You can also add strings together, which is called "concatenation":

In [None]:
"hello" + " " + "world"

You can multiply strings by numbers too, which is kind of weird:

In [None]:
"ba" + "na" * 2

You can't multiply strings by things that aren't integers, though:

In [None]:
"hello" * 3.14

That's an error! We'll come back to how to read that soon.

## += Operator

The += (plus-equals) operator is a convenient way to add a value to an existing
variable, and assign the new value back to the same variable.

Essentially, `x += y` is the same as `x = x + y`.

When the variable and the values are strings, this operator performs
concatenation instead of addition.

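For numbers and strings, this creates a new value and assigns it to the
variable, so any other variable that referred to the old value is not affected.
For mutable objects such as lists, the operation is performed in place, meaning
that any other variable which refers to the same object will see the update.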
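
Here is a minimal sketch of what `+=` does to other variables that share a value, for a number, a string, and a list. Lists are introduced in the Data Structures section, and all of the names here are made up for illustration:

```python
# With numbers (and strings), += builds a new value and re-points the name.
count = 1
snapshot = count       # snapshot refers to the same value, 1
count += 1             # count now refers to 2; snapshot is untouched
print(count, snapshot)     # 2 1

greeting = "hello"
greeting += " world"   # for strings, += concatenates
print(greeting)            # hello world

# With a list, += changes the list itself, so every name for it sees the change.
items = [1, 2]
alias = items          # alias and items are two names for one list
items += [3]
print(items, alias)        # [1, 2, 3] [1, 2, 3]
```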

Let's try it:

In [None]:
counter = 1

In [None]:
print(counter)

In [None]:
counter += 1

In [None]:
print(counter)

In [None]:
counter += counter
print(counter)

In [None]:
# Try running this cell
2 + 2

## Cell Outputs

Each time you run a cell, it executes all the code in it, but only prints
the last value:

In [None]:
# This cell computes two things. You will see that the first one doesn't
# get printed in the output. Just the last line (the last "expression")
# gets printed.

# The sum of the first six Fibonacci numbers:
0 + 1 + 1 + 2 + 3 + 5

# And the sum of the first six primes:
2 + 3 + 5 + 7 + 11 + 13

Can you compute the sum of the numbers 1 through 10 in the next cell?

### Numbers in Python
There are a few numeric data types, and which one you use depends on what you intend to do with the number:
- **Integer:** an `int` is a whole number. It has no decimal point and includes all counting numbers (1, 2, 3...) as well as their negative counterparts and the number 0. The number `0` is an integer, but the same number written as `0.0` is a floating-point number.

- **Floating-point number:** a `float` is a decimal number. It can be used to represent fractional quantities as well as precise measurements. For example, `a = 3/5` cannot be represented as an integer, so the variable `a` is assigned the floating-point value `0.6`.
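
The difference between the two number types can be seen in a short sketch like this one. The variable names are made up for illustration, and `type()` is a built-in function that reports a value's type:

```python
zero_int = 0        # an integer
zero_float = 0.0    # a float, even though it equals the integer 0
a = 3 / 5           # dividing with / always gives a float

print(type(zero_int), type(zero_float))   # <class 'int'> <class 'float'>
print(zero_int == zero_float)             # True
print(a)                                  # 0.6
print(8 / 4)                              # 2.0 (a float, even though it divides evenly)
```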

### Strings
Computer programmers refer to blocks of text as strings. In Python, a string is surrounded by either double quotes (`"Hello World"`) or single quotes (`'Hello World'`).

It doesn't matter whether you use single quotes or double quotes; just be consistent.
Depending on the analysis you are completing, you might use strings when defining your functions, as arguments passed to a function, or as data to work with.

In [None]:
# this is a string

"Hello World"

In [None]:
# this is ALSO a string

'Hello World'
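
As a small sketch of why both kinds of quotes exist: the two spellings produce the same string, and choosing one kind of quote lets the text itself contain the other kind. The variable names are made up for illustration:

```python
double = "Hello World"
single = 'Hello World'
print(double == single)    # True: the quotes are not part of the text

# Picking the other kind of quote lets the text contain a quote character.
contraction = "It's a string"
quotation = 'She said "hi"'
print(contraction)         # It's a string
print(quotation)           # She said "hi"
```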

### print() Function
The `print()` function is used to output text, numbers, or other printable information to the console.

It takes one or more arguments and outputs each of them to the console, separated by a space.
If no arguments are provided, the `print()` function outputs a blank line.

In [None]:
print("Hello World!")
print(100)
pi = 3.14159
print(pi)

You can print multiple times in a cell if you like.

In [None]:
print(34 + 55)
print("I am a banana!")

### Parentheses

**Parentheses:** to call a function in Python, type its name followed by parentheses `()`.

Can you name a function we've used so far?


**Brackets:** a list begins and ends with square brackets, and each item is separated by a comma. It is good practice to insert a space after each comma. Brackets are also used with dataframes. Defining a list always takes the form `list_name = [ ]`.

NOTE: we will cover *Lists* in chapter 3


### Variables
A variable is used to store data that will be used by the program. You give the data a name, and you can then use that name again later to refer to the value.

We assign variables with the `=` symbol:

- The data can be a number, a string, a Boolean, a list, or some other data type. Every variable has a name, which can consist of letters, numbers, and the underscore character `_`.
- If there is a greeting we want to present, a date we need to reuse, or a user ID we need to remember, we can create a variable to store that value.
- The equals sign `=` is used to assign a value to a variable. After the initial assignment is made, the value of a variable can be updated to new values as needed.
- Variable names can't begin with a number, but they can have numbers after the first character (e.g. `cool_variable_5`).
- Variables that are assigned numeric values can be treated the same as the numbers themselves. Two variables can be added together, divided by 2, and multiplied by a third variable. Performing arithmetic on variables does not change them; a variable only changes when you assign to it again with `=` (or a shorthand like `+=`).

In [None]:
# these are all valid variable names and assignments
user_name = "leif"
user_id = 100
verified = False

# a variable's value can be changed after assignment
points = 100
points = 120
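
The last point in the list above (arithmetic alone does not change a variable) can be sketched like this. The names `bonus` and `total` are made up for illustration:

```python
points = 100
bonus = 20

total = (points + bonus) / 2   # arithmetic uses the values...
print(points, bonus)           # 100 20  (...but leaves the variables unchanged)
print(total)                   # 60.0

points = points + bonus        # assigning with = is what updates a variable
print(points)                  # 120
```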


**Remember:** We assign variables with the `=` symbol:

In [None]:
x = 101

Then in other cells we can use them:

In [None]:
print("There are", x, "dalmatians")

You can assign as many variables as you like in the same cell, and use them too:

In [None]:
a = 3
b = 2
a * b

You can also change what a variable that's already defined means. This
is where a lot of the "danger" of Jupyter notebooks comes from.

Run this cell, then go back and run the cell about dalmatians again:

In [None]:
x = -1

Hopefully you noticed how, when you ran the dalmatians cell, it printed
something different, because you changed what `x` meant.

You can look at what all the names you've created mean:

In [None]:
print("a =", a)
print("b =", b)
print("x =", x)

### f-Strings
##### A quick note on string formatting:
Now that we have variables and strings, we can talk about "f-strings":
these are strings with the letter `f` in front of them, and they can
expand the values of variables inside them. So, we can talk about
dalmatians again in a new way:

In [None]:
x = 101
print(f"There are {x} dalmatians")

The format expressions inside f-strings are really powerful. We won't
talk about all of them here, but one that's really useful is the "debug
print" one, where you add an `=` character inside the curly braces:

In [None]:
print(f"{a=}")
print(f"{b=}")
print(f"{x=}")

You can also format with larger expressions:

In [None]:
print(f"{a * b = }")
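
To see exactly what text these f-strings produce, here is a short sketch that stores each result in a variable before printing it. The variable names are made up for illustration, and the `=` form needs Python 3.8 or newer:

```python
a = 3
b = 2
user_name = "leif"

plain = f"{a} times {b} is {a * b}"   # any expression can go inside the braces
debug = f"{a * b = }"                 # the = form shows the expression and its value
quoted = f"{user_name=}"              # with =, strings are shown with their quotes

print(plain)    # 3 times 2 is 6
print(debug)    # a * b = 6
print(quoted)   # user_name='leif'
```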

## Errors

When a program hits an error that we didn't expect, Python prints what's
called a "traceback", which shows where the error occurred. For some errors,
it also marks the exact spot with a `^` character.

Some common errors that we encounter while writing Python are:
- **SyntaxError**: means there is something wrong with the way your program is
  written. Punctuation that does not belong, a comment where it is not
  expected, or a missing parenthesis can all trigger a SyntaxError.
- **NameError**: occurs when the Python interpreter sees a word it does not
  recognize. Code that contains something that looks like a variable but was
  never defined will throw a NameError.
- **TypeError**: occurs when you try to combine two objects that are not
  compatible. For example, you can't add a string to an integer, or multiply
  a string by another string.

Let's make some errors:

In [None]:
a +

In [None]:
print(sarah)

In [None]:
"strings can't" * "be multiplied"

## Data Structures

Python comes with a few useful data structures built in. We'll start with just
two you'll use a lot:

- **Lists**: a list is an ordered sequence of things. You can put anything you
  like in a list, and you can put different types of things in the same list.
  You can even put other lists in a list. A list is defined with square
  brackets, and the things in it are separated by commas:

  ```python
  cool_list = ["white vans", "bear", "ice"]
  ```

- **Dictionaries**: sometimes called "dicts" or "maps", a dictionary associates
  pairs of things. The first thing in the pair is called the "key", and the second is
  called the "value". You can use the key to look up the value. A dictionary is
  defined with curly braces, and the pairs are separated by commas, with the
  key and value separated by a colon:

  ```python
  cool_dict = {"shoes": "white vans", "animal": "bear", "food": "ice"}
  ```

Let's actually make these data structures:

In [None]:
cool_list = ["white vans", "bear", "ice"]
cool_dict = {"shoes": "white vans", "animal": "bear", "food": "ice"}

In [None]:
cool_list

In [None]:
cool_dict


In both cases, you use `[]` to look up a value.

For a list, you use the index of the thing you want to look up, which is an
integer (the first thing is at index 0, the second at index 1, and so on).

In [None]:
cool_list[1]

For dicts, you use the key you want to look up:

In [None]:
cool_dict["animal"]
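
As a further sketch of lookups, the list below mixes several types and contains another list, as described earlier. The name `mixed_list` is made up for illustration, and `len()` is a built-in function (not covered above) that counts the items in a list or dict:

```python
mixed_list = ["white vans", 3, 3.14, True, ["a", "nested", "list"]]

print(mixed_list[0])       # white vans  (the first item is at index 0)
print(mixed_list[4])       # ['a', 'nested', 'list']
print(mixed_list[4][1])    # nested  (a second [] looks inside the inner list)

cool_dict = {"shoes": "white vans", "animal": "bear", "food": "ice"}
print(cool_dict["food"])   # ice
print(len(mixed_list), len(cool_dict))   # 5 3
```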

If you try to look up something that's not in the list or dict, you'll get an
error:

In [None]:
cool_list[10]

In [None]:
cool_dict["mc hammer"]
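
The two errors above have different names: a bad list index raises an `IndexError`, and a missing dict key raises a `KeyError`. The sketch below uses `try`/`except`, which this chapter does not cover, only so that the code can keep running and print each error's name; the names `list_error` and `dict_error` are made up for illustration:

```python
cool_list = ["white vans", "bear", "ice"]
cool_dict = {"shoes": "white vans", "animal": "bear", "food": "ice"}

# try/except (not covered in this chapter) lets the code continue after an
# error, so we can look at the error instead of stopping with a traceback.
try:
    cool_list[10]
except IndexError as error:
    list_error = type(error).__name__
    print(list_error, "-", error)   # IndexError - list index out of range

try:
    cool_dict["mc hammer"]
except KeyError as error:
    dict_error = type(error).__name__
    print(dict_error, "-", error)   # KeyError - 'mc hammer'
```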