# CYPLAN 255, Spring 2024
## Urban Informatics and Data Visualization

**Instructor: Max Gardner** <br>
**GSI: Meiqing Li**

# Lecture 03 -- Python from the Command Line
*******
January 24, 2024

# Agenda
1. Announcements
2. Getting started with GitHub (Cont'd)
3. Python from the Command Line
4. For next time
5. Questions

# 1. Announcements

- Waitlist (good news)
- Camera policy 
- Maptime! 
   - https://www.meetup.com/maptime-sf/events/298643532/
- Error on GitHub cheat sheet:
   - ~~CYPLAN255~~ --> UCB_CYPLAN255_2024

# 2. Git Cont'd

![](https://lh3.googleusercontent.com/pw/ABLVV84wrG-7FUen9lAb3UkfD5Cg_KYFJutSW8RPXf0abaLKHnn4UJXEI0eDIXRQbP0fZviMeEWiEuBF-4YXpyCZ86ZD0rBsA-78Ps0WYxdW_X_XQqLjyYWPIph5rcV3T06kdcK-QwmUCu4NBo1NCEskHQ0oTQ=w3450-h1522-s-no)

## 2.1 Authentication

- Pushing any changes from local to remote repo requires authentication
- Two options for authentication, determined when you clone a repo

     | https + access token (see docs [here](https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/managing-your-personal-access-tokens))| ssh + keypair (see docs [here](https://docs.github.com/en/authentication/connecting-to-github-with-ssh))|
     |--|--|
     |<img src="https://lh3.googleusercontent.com/pw/ABLVV87l6xAW6wHFiYWdpH_MxmBHqMHyHJPgiPDnYBcWVLCZ5gIONcP24M7NJZiQmUhe69k8ZVxqCSkw1MTYEkJdcYd9I_8RTw9ymaEGPMOkI_rx7MCYR8w8ORoUqICjrm18puL0pZr7YM4p4x3h0VGYxyxe3A=w786-h326-s-no" width=70% align="center"> | <img src="https://lh3.googleusercontent.com/pw/ABLVV86BYqb1hfWmd_UZHVeT5LZGG7aY5D53Pc5uZeWe3xPKJBxD5eFjbPv-8dSgyzk6bPLaWUfd-YEZHJpGMCrhaeQ66qGevKqENycoA1U9qVLkThcgol4yQSKbD7oyltekgM4_H_OfBrOLrX21JlYNqIV2LA=w776-h318-s-no" width=70% align="center"> |

- https vs. ssh
    - an access token works like a password; you'll be asked to enter it each time you push a commit
    - ssh does everything behind the scenes after you push your first commit, but requires a bit more work to set up


## 2.2 Syncing your fork

### 2.2.1 The lazy way:
1. Sync from the web GUI:
   <img src="https://lh3.googleusercontent.com/pw/ABLVV85AWws_AzaMnsHy7caMV9u6W9523R1oBbiLIQbRZl6t8cirn0FHyS-DgLFQThPPWgTGCGlQlU-2rxO2uFTbXR4m3RU0hD0aLmEBECMYuYeu7LhTPj3l3idvG5KhCEb1EmoTiyiUYFNTImXooTSpDAqiHA=w1832-h966-s-no" width=60%>
2. `git pull` from your remote to local

NOTE: Doing things the lazy way may cause problems for you down the road, so don't rely on it.

### 2.2.2 The proper way:
1. Add `upstream` as an additional remote for your local copy of the repo:
   - `git remote add upstream https://github.com/mxndrwgrdnr/UCB_CYPLAN255_2024.git`
2. Make sure you're on the "main" branch:
   - `git checkout main`
3. "Fetch" any changes to "main" from `upstream`, but don't merge them yet:
   - `git fetch upstream`
4. Merge any changes from `upstream/main` into your local repository:
   - `git merge upstream/main`

## 2.3 Tips for avoiding merge conflicts

- Only commit changes from one file at a time:
   - `git add my_script.py` instead of `git add .`
- Create a copy of any file that you don't own, give it a new name, and edit that one, not the original!

# 3. Python from the Command Line
<br>
<br>
<br>
<br>
<br>
<br>
<br>
<br>
NOTE: This portion of the notebook was heavily adapted from previous course material by Prof. Paul Waddell and Samuel Maurer.

## 3.1 Installing Anaconda Python Bundle with Jupyter on Your Own Computer

You should install Python on your own computer for this class to gain experience managing a Python installation and supporting libraries, and to give you more compute resources than you would get using Datahub.  Please use the following installer to make things as consistent as possible with the environment in class.

- https://www.anaconda.com/products/individual

You will find installers for each operating system, and the current default version is 3.9.  Please do not install an earlier version, especially not one below Python 3, since the syntax is very different and many libraries no longer support it.

Python and the Anaconda distribution are free. In fact, all the software we will use in this class is open source and free (as in no cost). Python runs on Windows, macOS, and Linux, so regardless of what computer you are using, it will most likely run on it.

## 3.2 Python vs. IPython vs. Jupyter
- **Python** - an interpreted, high-level programming language
    - At runtime, the Python 'interpreter' reads the command, figures out what the intended computation is, and then executes it. 
    - This differs from low-level languages like C or C++, in which you generally have to _compile_ code before you can run it, and if you find errors they have to be diagnosed and then the code re-compiled before you can run it. 
    - Interpreted languages skip the compile step, and just execute code directly, and if there are errors, they are seen at runtime. 

<img src="https://www.python.org/static/community_logos/python-logo-master-v3-TM.png" width=200% align="center">






## 3.2 Python vs. IPython vs. Jupyter
- IPython - “interactive” Python interpreter 
- Jupyter Notebooks - web-based GUI for IPython


<img src="https://upload.wikimedia.org/wikipedia/commons/3/3c/IPython_Logo.png" width=50% align="center">
<img src="https://miro.medium.com/v2/resize:fit:1400/format:webp/1*b1PpLl1-C8FWTLzNO3OqVA.jpeg" width=20% align="center">

## 3.3 Options for running Python from a terminal

When we write and execute Python code, we generally do that within an environment called an interpreter. Python interpreters and editing environments can be quite varied. Some options include:
- `python`				--		launch the default Python interpreter
   - `exit()` 			--			exit
   - `<ctrl> + d`   	--		exit (Mac/Linux)
   - `<ctrl> + z` 		--	exit (Windows)
- `ipython`		--		launch the interactive Python interpreter
   - `exit` 	--					exit
   - `<ctrl> + d` --   			exit  (Mac/Linux)
   - `<ctrl> + z` --  			exit (Windows)
- `jupyter notebook` --			launch a notebook server and dashboard
   - `<quit>` --		exit (notebook dashboard)
   - `<ctrl> + c` -- 			exit (Mac/Linux)
- Launch Python from an integrated development environment (IDE) like PyCharm or Visual Studio Code (VSCode)
- `python my_script.py`	--	execute a Python script, interpreted in the background.

## 3.4 Managing packages with virtual environments

### 3.4.1 Python, Anaconda, Conda, etc.

- **Anaconda** – a Python _distribution_ <img src="https://know.anaconda.com/rs/387-XNW-688/images/Anaconda_ForTrademark_HorizontalLarge_white.png" width=50% align="right">
- **Conda** – a Python _package manager_ and _environment manager_
   - Created by the Anaconda folks
   - As a package manager:
      - Installs Python libraries (packages) from package repositories (e.g. conda-forge)
      - Manages dependencies and resolves conflicts
      - Other examples: “pip”
   - As an environment manager:
      - Manages Python virtual environments (sandboxes)
      - Other examples: “virtualenv”
      
      


### 3.4.2 Max’s Tips for Creating a Conda Environment

- `conda create -n my-first-env`  <-- name this whatever you want
- `conda activate my-first-env`
- `conda config --add channels conda-forge`
- `conda config --set channel_priority strict`
- `conda install python ipython notebook nb_conda_kernels`
- optional:
   - `conda install jupyter_contrib_nbextensions`
   - `jupyter contrib nbextension install --user`


## 3.5 Jupyter Notebooks

This first session will cover the basics of Python, and introduce elements that will help you get familiar with Python as an interactive computational environment for exploring data.  The material is presented in an interactive environment that runs within your web browser, called a Jupyter Notebook.  It allows presentation of text and graphics to be combined with Python code that can be run interactively, with the results appearing inline.  We are looking at a Jupyter notebook now.  ~~Note that Jupyter is a relatively recent name for this so sometimes you may still see it referred to as an IPython notebook.  Jupyter is just the new version of IPython notebooks, but now also supports a variety of other languages and tools.~~  

Let's start by getting familiar with the Jupyter Notebook and how it works.




Once you have a shell, use `cd` (change directory) to navigate to whatever directory you want to work in. You'll need to use the command prompt from the beginning of this course, so get comfortable with basic commands this week if you are not already.

At the command prompt, `cd` to the location of this notebook, and run the following command:
- `jupyter notebook`

This command does two things:
1. launches a Jupyter Notebook Server in the terminal
2. launches the [Notebook Dashboard](https://jupyter-notebook.readthedocs.io/en/stable/ui_components.html#notebook-dashboard) in the browser



### 3.5.1 Using Jupyter Notebooks

From the Notebook Dashboard, you can load an existing notebook if you see one, create a new one, or open a terminal. If you started the server from the right place, you should see the name of this notebook listed there: **lecture_03_command_line_python.ipynb**. If you click on this notebook, another tab will open in your browser, containing this notebook, ready to use. Go ahead and do that.

A notebook is made of cells. So far in this notebook you've only seen cells that contain text. These are markdown cells. Notice the pulldown list for the cell type contains:

* Code -- which we will use for Python code mainly, though it could use other languages
* Markdown -- like this cell, using a flavor of structured text like the one used in Wikipedia and many other platforms
* Other options will appear depending on what else is installed for use with Jupyter, like kernels for Scala, R, Octave, etc.

#### Edit Mode + Markdown
You can edit the contents of a cell by double-clicking on it. The border of the cell will turn from blue to green. You are now in "edit mode". Try it on this cell. When you are ready to save the cell or exit edit mode, just use `<shift> + <enter>`. That's one way to run a cell. We will see how the code cells work next. Before going on to that, read a little bit about how you can format your text cells with Markdown [here](https://jupyter-notebook.readthedocs.io/en/stable/examples/Notebook/Working%20With%20Markdown%20Cells.html)

For example, you can render LaTeX equations in your Markdown cells:

- $y = \alpha + \beta X $
- $c = \sqrt{a^2 + b^2}$

Or use the backtick "\`" character to style text like lines of code:
- `math.sqrt(98)`

Cells can also contain lines of interactive code. The next cell contains a Python command. You can run that command in two ways:
1. Select the cell in "command mode" (blue) and type `<shift> + <enter>`
2. Click the Run icon (left of the black square) on the toolbar.

Try running the next cell and notice that it executes the command and writes the output below the cell:

In [None]:
import math
math.sqrt(49)

## 3.6 Hello World!

The first programming command demonstrated when you are learning a programming language is usually to make the computer print 'Hello World!'.  In Python, doing this is pretty simple:

In [None]:
print("Hello World!")

As you can see, there is not much code involved in making this happen.  The word 'print' is a command that Python knows how to process, and the text string 'Hello World!' in quotations is an **argument** being passed to the print command.  You can of course pass any kind of argument to the Python print command, and it will try to *do the right thing* without you having to micro-manage the process.
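To make the idea of passing different kinds of arguments concrete, here is a minimal sketch. The variable name `greeting` is just an illustration and does not appear elsewhere in this notebook.

```python
# print accepts several arguments and separates them with spaces by default
print("Hello", "World!")

# numbers and lists are converted to text automatically
print(2024, 3.14, [1, 2, 3])

# you can also build the text first and pass it as a single argument
greeting = "Hello" + " " + "World!"
print(greeting)
```

In each case `print` works out how to display what it was given, whether that is text, a number, or a list.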

## 3.7 Python as an Interactive Calculator

Python can be used as a simple interactive calculator, by just typing in a mathematical expression as you might on a regular or scientific calculator:

In [None]:
2 - 4

What happened above is that Python parsed the line `2 - 4`, treating the first object it encountered as an integer, the second as the mathematical operator for subtraction, and the third as another integer.  Python's interpreter mostly just tries to figure out what you mean when you write statements like this, and as long as it is unambiguous and feasible to compute, it just does it without you having to explain things in detail.

You can of course use any kinds of numbers (e.g. integers or decimals), and any standard mathematical operators, and most of the time you get what you expect:

In [None]:
3.2 * 4

In [None]:
3 ** 4

### A Note on Calculating with Different Data Types

What happens if we perform computations with mixed types?

In [None]:
print(12 + 3)
type(12 + 3)

In [None]:
print(12. + 3)
type(12. + 3)
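The two cells above can be summarized in a short sketch that stores each result in a variable and checks its type. The variable names are made up for illustration; the last two lines go slightly beyond the cells above by showing that `/` always produces a float in Python 3, while `//` (floor division) keeps integers as integers.

```python
int_result = 12 + 3        # int + int stays an int
float_result = 12. + 3     # float + int becomes a float
division_result = 12 / 3   # true division always returns a float in Python 3
floor_result = 12 // 5     # floor division of two ints returns an int

print(type(int_result).__name__, int_result)
print(type(float_result).__name__, float_result)
print(type(division_result).__name__, division_result)
print(type(floor_result).__name__, floor_result)
```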

## 3.8 The Interactive Plotting Environment

OK, so maybe using Python as an interactive calculator is not the most compelling case for using Python, even if it does demonstrate that Python has a very shallow learning curve for someone completely new to programming.  You can actually begin using it productively even before learning how to program in it!

To give a preview of somewhat more advanced topics, let's look at the interactive plotting mode in IPython that we can invoke by using 'magic' commands, and importing some modules: 

In [None]:
#import necessary packages/modules
import pandas as pd, numpy as np, matplotlib.pyplot as plt

This loads pandas and numpy and the matplotlib plotting environment.  We'll come back to these libraries in more detail later, but now let's look at how they allow us to extend the range of things we can do.  Let's assign 100 sequential numbers to a variable labeled x, and create another variable, y, that has some transformation of x, and then plot y against x:

In [None]:
x = range(100)
print(x)
print(list(x))

In [None]:
y = np.sin(x)
print("The first 10 entries in y: {0}".format(y[:10])) 

In [None]:
plt.plot(x, x * y)

Or here is how we could draw 1,000 random numbers from a normal distribution, and plot the results as a frequency histogram:

In [None]:
x = np.random.randn(1000)
_ = plt.hist(x, bins=30)
y = x * 5

## 3.9 Getting Some Help

A couple of things that IPython does to help you be more productive are useful to introduce here. 

One is called **tab-completion**. If you can't quite remember the full name of a function, or it is really long and you don't like to type much, you can type the first few characters, and hit the `<tab>` key, and the options that begin with those first few characters show up on a menu.

You can also use `<shift> + <tab>` to display a **tool-tip** for a function you may have forgotten how to use.

Try these in the cell below:

In [None]:
plt.hist()

The other thing you can make a lot of use of is **help**! If you want to know more about how a method or function works, type the name of the function followed by `?`. For example, if we wanted to see how to configure the hist command, we could do:

In [None]:
plt.hist?

This brings up help text for this command, in a split window in the IPython Notebook.  After you read the help, you can minimize the help window by dragging the divider down to the bottom of the Notebook window.

Another interactive feature I use all the time is the `dir()` function. With no argument, it will tell you about all of the Python objects you have access to in your **namespace**. Or, if you pass in a Python object, it will tell you all of the attributes of that object. Give it a try:

In [None]:
dir()

In [None]:
dir(x)

In [None]:
x.max()
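The output of `dir()` can be long, so it often helps to filter it. The sketch below uses the standard library `math` module as a stand-in for any object you want to inspect; `public_names` is a name chosen for this example.

```python
import math

# keep only the names that do not start with an underscore
public_names = [name for name in dir(math) if not name.startswith("_")]

print(len(public_names), "public names in the math module")
print(public_names[:5])

# every function carries its own help text in the __doc__ attribute
print(math.sqrt.__doc__)
```

The same pattern works on objects like `x` above: swap `math` for the object you are curious about.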

## 3.10 Useful Notebook shortcuts

For more shortcuts see `Help > Keyboard Shortcuts` in the file menu at the top of this page

### Command mode
- `00` -- Restart the kernel
- `<shift> + m` -- Merge the contents of a cell with the cell below it
- `a` -- Create a new blank cell above this one
- `b` -- Create a new blank cell below this one
- `dd` -- Delete this cell
- `y` -- Convert cell to code
- `m` -- Convert cell to markdown

## 3.11 Python Fundamentals

### 3.11.1 What is a Program?

As Allen Downey explains in _Think Python_ the main elements of a program are:

- **input**: Get data from the keyboard, a ﬁle, or some other device.
- **output**: Display data on the screen or send data to a ﬁle or other device.
- **math**: Perform basic mathematical operations like addition and multiplication.
- **conditional execution**: Check for certain conditions and execute the appropriate code.
- **repetition**: Perform some action repeatedly, usually with some variation.

These are common steps that you will find to be a generic recipe for many programs, whether written in Python or any other language.
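As a rough illustration of how these five elements fit together, here is a tiny program. The data and variable names are invented for this sketch; in a real program the input would more likely come from a file or the keyboard.

```python
# input: a few hard-coded readings stand in for data from a file
readings = [12.5, 7.0, 19.25]

total = 0.0
for value in readings:         # repetition: visit each reading in turn
    if value > 10:             # conditional execution: keep only large readings
        total = total + value  # math: accumulate a running sum

# output: display the result on the screen
print("Sum of readings above 10:", total)
```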

###  3.11.2  Basic Data Types

Data in Python is interpreted as having a **type**.  In low-level, compiled languages like C or C++, the programmer has to explicitly declare the type of each variable before actually using it.  In Python, the type is inferred at run time, and you can always ask Python what the type of an object is:

In [None]:
a = 13
type(a)

In [None]:
a = a * 1.1
type(a)

In [None]:
a = 'Hello World!'
type(a)

Notice that when we multiply `a`, which was initially an integer, by a floating point (decimal) number, the result is **cast** as a float.  This is like the mixed-type calculation earlier (`12. + 3`) -- using a floating point number in the calculation causes the result to become a floating point number.

Notice also that we can reassign any value or type to a variable.  We began with `a` being an integer, then changed its value to a float, and then to a string (text).  Variables are dynamically updated in this way based on values assigned to them. That's what people mean when they say Python is a _dynamically typed_ language, instead of a _statically typed_ language like C++.
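Here is the same sequence of reassignments as a single sketch, recording the type after each step. The names `value` and `observed_types` are chosen for this example only.

```python
observed_types = []

value = 13                    # starts out as an integer
observed_types.append(type(value).__name__)

value = value * 1.1           # multiplying by a float yields a float
observed_types.append(type(value).__name__)

value = "Hello World!"        # the same name can now refer to a string
observed_types.append(type(value).__name__)

print(observed_types)
```

The name `value` never had a declared type; at each step its type is simply that of the object it currently refers to.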

### 3.11.3 Lists
In Python, you can also make lists of numbers. A Python **list** is enclosed in square brackets. Items inside the list are separated by commas.

In [None]:
# a list
[7.0, 6.24, 9.98, 4]

Lists can be stored as variables, which is handy when you want to save a set of items without writing them out over and over again.

In [None]:
my_list = [4, 8, 15, 16, 23, 42]
my_list
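Once a list is stored in a variable, you can look up individual items, count them, and add new ones. A short sketch follows; the variable names other than `my_list` are made up for illustration.

```python
my_list = [4, 8, 15, 16, 23, 42]

first_item = my_list[0]     # indexing starts at 0
last_item = my_list[-1]     # negative indexes count from the end
item_count = len(my_list)   # number of items currently in the list

my_list.append(108)         # add a new item to the end of the list

print(first_item, last_item, item_count)
print(my_list)
```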

### 3.11.4 Variables

Variables are named objects that we use to store a value. They can be of any type: 

In [None]:
city = 'San Francisco'
print(city, 'is a ', type(city))

In [None]:
x = 345
print(x, 'is a ', type(x))

In [None]:
y = 2.324
print(y, 'is a ', type(y))

You can use a lot of names for a variable, but there are exceptions (another word for error!).  Some rules apply.  You can't use Python reserved words, or start with a number, or use nonstandard characters like a copyright symbol.  You'll get an **exception** if you do:

In [None]:
2x = 24

And here are the 35 keywords reserved by Python 3 (version 3.7 and later), which are ineligible for use as variable names:

`False`, `None`, `True`, `and`, `as`, `assert`, `async`, `await`, `break`, `class`, `continue`, `def`, `del`, `elif`, `else`, `except`, `finally`, `for`, `from`, `global`, `if`, `import`, `in`, `is`, `lambda`, `nonlocal`, `not`, `or`, `pass`, `raise`, `return`, `try`, `while`, `with`, `yield`.

Note that `print` and `exec` were keywords in Python 2 but are ordinary functions in Python 3.
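Rather than memorizing a list, you can ask your own Python installation which words are reserved, using the standard library `keyword` module. This is a small sketch; the variable names are chosen for illustration.

```python
import keyword

# the full list of reserved words for the Python version you are running
print(keyword.kwlist)

# check individual names before using them as variables
is_print_reserved = keyword.iskeyword("print")        # a keyword in Python 2 only
is_nonlocal_reserved = keyword.iskeyword("nonlocal")  # added in Python 3
is_for_reserved = keyword.iskeyword("for")

print(is_print_reserved, is_nonlocal_reserved, is_for_reserved)
```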

### 3.11.5 Operators and Equations

Operators are symbols used to indicate different operations, mostly mathematical, but some operate on strings also. Many basic arithmetic operations are built into Python, like:
- `+` -- addition
- `-` -- subtraction
- `*` -- multiplication
- `/` -- division
- `**` -- exponentiation (note that `^` is the bitwise XOR operator in Python, not exponentiation)
- `%` -- modulo

There are many others, which you can find information about [here](http://www.inferentialthinking.com/chapters/03/1/expressions.html). 


The computer evaluates arithmetic according to the PEMDAS order of operations (just like you probably learned in middle school): anything in parentheses is done first, followed by exponents, then multiplication and division, and finally addition and subtraction.
Some basic operations:

In [None]:
5 * 5

In [None]:
x = 5
x = x / 2.1
print(x)

In [None]:
y = x ** 2
print(y)
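A few more operators are worth trying side by side, since some of them are easy to confuse. The sketch below uses made-up variable names; note in particular that `^` is bitwise XOR in Python, so it does not raise a number to a power.

```python
power = 2 ** 3              # exponentiation
xor = 2 ^ 3                 # bitwise XOR, NOT exponentiation
remainder = 7 % 3           # modulo: the remainder after division

no_parens = 2 + 3 * 4       # multiplication happens before addition
with_parens = (2 + 3) * 4   # parentheses are evaluated first

print(power, xor, remainder)
print(no_parens, with_parens)
```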

NOTE: Python's style guide (PEP 8) recommends using whitespace (" ") on either side of an operator in an expression

Some of these operators also work on strings, but the behavior is different:

In [None]:
city = 'San Francisco'
sep = ', '
state = 'California'
location = city + sep + state
print(location)

In [None]:
city * 4

### 3.11.6 Expressions

**Expressions** are combinations of values, variables, and operators, like most of the lines of code that we've just seen.


In [None]:
# an example of expression
14 + 20

When you run the cell, the computer evaluates the expression and prints the result. Note that only the value of the last expression in a code cell is displayed automatically; to show anything else, you have to explicitly print it.

In [None]:
# more expressions. what gets printed and what doesn't?
100 / 10

print(4.3 + 10.98)

33 - 9 * (40000 + 1)

884

You can also assign names to expressions. The computer will compute the expression and assign the name to the result of the computation.

In [None]:
y = 50 * 2 + 1
y

We can then use these names as if they were whatever they stand for (in this case, numbers).

In [None]:
x - 42

In [None]:
x + y

In [None]:
# before you run this cell, can you say what it should print?
4 - 2 * (1 + 6 / 3)

### 3.11.7 Statements
**Statements** are lines of code that Python can execute. They often include expressions, but unlike expressions they do not always have a value (e.g. an assignment like `x = 5`).
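One way to see the difference is to compare what an expression evaluates to with what a call to `print` gives back. This is only a sketch, and the variable names are invented for it.

```python
# an expression has a value, which can be stored under a name
expression_value = 14 + 20

# print displays its argument as a side effect, but hands back no useful value
print_result = print("Hello World!")

print(expression_value)
print(print_result)
```

The last line displays `None`, Python's way of saying "no value": `print` is run for its effect on the screen, not for a result.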


### 3.11.8 Scripts
A **script** is a text file which stores a bunch of Python statements to be executed in sequential order. Python scripts will normally use the ".py" file extension.

Jupyter makes it easy for you to convert a .ipynb notebook to a .py script from the file menu: `File > Download as > Python (.py)`. Python scripts can be run at the command line by typing `python <filename>` where `<filename>` is the name of the .py script you want to run. 
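For reference, here is a minimal sketch of what a script might contain. The file name `my_script.py` and the function `describe` are hypothetical, chosen only for this example.

```python
# my_script.py (hypothetical file name)

def describe(city, state):
    """Join a city and a state into a single location string."""
    return city + ", " + state


# this block runs only when the file is executed as a script,
# e.g. with `python my_script.py`, not when it is imported
if __name__ == "__main__":
    print(describe("San Francisco", "California"))
```

If you saved this as `my_script.py`, running `python my_script.py` from the same directory would execute the statements from top to bottom.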

### 3.11.9 Comments
Comments are a good thing to add to code, both in notebooks and scripts, to remind yourself of its intended use or to document it for someone else who may want to run it in the future. Use liberally! They won't slow anything down.

In [None]:
# This is a comment explaining the code below which, if my code is complex,
# I might not remember in detail later without comments.
# Below I create an array of 10 numbers by adding normally distributed noise
# (mean 0, standard deviation 10000) to a base of 50000, and then take the
# natural logarithm of the result
income = 50000.0 + 10000 * np.random.randn(10)
y = np.log(income)
y

## 3.12 Before going on... 
...make sure you have Anaconda Python installed and working correctly.

Once you have Anaconda Python installed you should be able to launch a Jupyter Notebook and experiment with creating some cells with Markdown text, some code cells with simple calculations and creating variables, and execute those cells using `<shift> + <enter>` or using the Run icon at the top of the notebook.

# 4. For next time

# 5. Questions?