Urban Data Science & Smart Cities <br>
URSP688Y <br>
Instructor: Chester Harvey <br>
Urban Studies & Planning <br>
National Center for Smart Growth <br>
University of Maryland

[<img src="https://colab.research.google.com/assets/colab-badge.svg"> Clean version](https://colab.research.google.com/github/ncsg/ursp688y_sp2024/blob/main/demos/demo02/demo02.ipynb)

[<img src="https://colab.research.google.com/assets/colab-badge.svg"> Modified in class](https://colab.research.google.com/drive/1prKx4NcR1mDwX2rvaFclyx0xX7-cVyTg?usp=sharing)

# Demo 2 - More programming fundamentals

- Intro to programming (continued)
    - Basic data types
    - Programming logic
        - Conditions
        - Loops
    - Functions
        - Namespaces
    - Goodies
        - Conditional expressions
        - List comprehensions
    - Errors and debugging
- Pseudocode

## Intro to Python Programming (continued)

### Basic style guidelines for Python
- At the very least, do *everything* consistently
- One statement per line
- Try to limit line length to 79 characters (72 for comments and docstrings), as PEP 8 recommends
- Use four spaces to indent
- Put spaces around operators (e.g., `1 + 1` or `day = 'Monday'`) (except in keyword function arguments)
- Use blank lines intentionally and consistently
- Use meaningful names
- Name variables and functions with `lowercase_underscores`
- Constants are often named in `ALL_CAPS_WITH_UNDERSCORES` (e.g., `SPEED_OF_LIGHT = 2.99792458e+8`)
- Name custom classes with `CapWords`
- In general, avoid spaces in folder and filenames used for programming
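
As a rough illustration, here is a short sketch that applies several of these guidelines at once. The names (`SPEED_OF_LIGHT_M_PER_S`, `light_travel_time`, `TripLog`) and the Earth-to-Moon distance are made up for this example and are not part of the course materials.

```python
# Constant: all caps with underscores
SPEED_OF_LIGHT_M_PER_S = 2.99792458e8


# Function and variable names: lowercase with underscores
def light_travel_time(distance_m):
    return distance_m / SPEED_OF_LIGHT_M_PER_S


# Custom class: CapWords
class TripLog:
    def __init__(self, trip_name):
        # Spaces around '=' in assignments...
        self.trip_name = trip_name


# ...but no spaces around '=' in keyword arguments
trip = TripLog(trip_name='Earth to Moon')
seconds = light_travel_time(distance_m=3.844e8)
print(f'{trip.trip_name}: about {seconds:.2f} seconds')
```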

See [Code Readability](https://github.com/ncsg/ursp688y_sp2024/blob/main/README.md#code-readability) on the syllabus. [CS61A](https://cs61a.org/articles/composition/) has an excellent composition guide. [PEP 8](https://peps.python.org/pep-0008/) is a standard Python style guide. [Google](https://google.github.io/styleguide/pyguide.html) publishes their internal Python style guide.

### Basic Data Types

#### String
Text. Must be surrounded by either double (") or single (') quotes

In [1]:
name = "Chester"
print(type(name))  # type() reports the data type of name: a string
print(name)  # print() shows the contents of name

<class 'str'>
Chester


[F-strings](https://realpython.com/python-f-strings/) are a super handy syntax for building strings with variables.

In [2]:
f'My name is {name}'

'My name is Chester'
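
F-strings can hold more than bare variable names. The sketch below shows two common patterns, an expression inside the braces and a format specifier after a colon. The variable `commute_miles` is a made-up value for illustration.

```python
name = 'Chester'
commute_miles = 3.14159  # hypothetical value, just for this example

# Any expression can go inside the curly braces
shout = f'{name.upper()} has {len(name)} letters'

# A format specifier after a colon controls how a value is displayed
rounded = f'{name} commutes {commute_miles:.1f} miles'

print(shout)
print(rounded)
```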

#### Integer
Number without decimal places

In [3]:
age = 25
print(type(age))
print(age)

<class 'int'>
25


#### Float
Number with decimal places

In [4]:
height = 5.95
print(type(height))
print(height)

<class 'float'>
5.95


#### Boolean
True or False

In [5]:
private_jet = False
print(type(private_jet))
print(private_jet)

<class 'bool'>
False


### Composite Data Types

#### List
An ordered array of objects.

In [6]:
fridge_contents = ['milk','apple','celery','yogurt']
print(type(fridge_contents))
print(fridge_contents)

<class 'list'>
['milk', 'apple', 'celery', 'yogurt']


In [7]:
# You can add lists together
fridge_contents = fridge_contents + ['orange juice', 'leftovers']
fridge_contents

['milk', 'apple', 'celery', 'yogurt', 'orange juice', 'leftovers']

In [8]:
# Or append elements to a list
fridge_contents.append('cheese')
fridge_contents

['milk', 'apple', 'celery', 'yogurt', 'orange juice', 'leftovers', 'cheese']

In [9]:
# Or remove things
fridge_contents.remove('yogurt')
fridge_contents

['milk', 'apple', 'celery', 'orange juice', 'leftovers', 'cheese']

In [10]:
# You can look things up in a list by index number, starting with 0
fridge_contents[0]

'milk'

In [11]:
# Or get just a part of a list with "slicing"
fridge_contents[:2]

['milk', 'apple']
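
A few more slicing patterns, sketched on a copy of the same list (it is redefined here so the example runs on its own):

```python
fridge_contents = ['milk', 'apple', 'celery', 'orange juice', 'leftovers', 'cheese']

# A slice is written start:stop, and the stop index is excluded
first_two = fridge_contents[:2]   # indexes 0 and 1
middle = fridge_contents[1:4]     # indexes 1, 2, and 3

# Negative indexes count backward from the end of the list
last_item = fridge_contents[-1]
last_two = fridge_contents[-2:]

print(middle)
print(last_item)
print(last_two)
```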

#### Dictionary

Labeled data stored as key-value pairs.

*Note*: Dictionaries used to be unordered. In CPython 3.6 they began preserving insertion order as an implementation detail, and since Python 3.7 the language guarantees it. Lists are still usually preferred when order matters. There's also something called an [ordered dictionary](https://realpython.com/python-ordereddict/), which makes it more explicit that you care about order and can make it easier to manage/change order.

In [12]:
goodness_at_sports = {
    'basketball': 2,
    'baseball': 1,
    'skiing': 8,
    'volleyball': 3,
}
print(type(goodness_at_sports))
print(goodness_at_sports)

<class 'dict'>
{'basketball': 2, 'baseball': 1, 'skiing': 8, 'volleyball': 3}


In [13]:
# You can add an entry to a dictionary
goodness_at_sports['cornhole'] = 3

In [14]:
# And remove one
goodness_at_sports.pop('baseball')

1

In [15]:
# And look up values based on keys
goodness_at_sports['skiing']

8
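
Looking up a key that isn't in the dictionary with square brackets raises a `KeyError`. The sketch below shows two ways to handle that, plus how to get the keys and values on their own. The dictionary is redefined so the example is self-contained, and `'curling'` is just a stand-in for any missing key.

```python
goodness_at_sports = {'basketball': 2, 'skiing': 8, 'volleyball': 3, 'cornhole': 3}

# Check whether a key exists before looking it up...
has_curling = 'curling' in goodness_at_sports

# ...or use .get(), which returns a fallback value instead of failing
curling_score = goodness_at_sports.get('curling', 0)

# .keys() and .values() give the keys and values separately
sports = list(goodness_at_sports.keys())
best_score = max(goodness_at_sports.values())

print(has_curling, curling_score)
print(sports)
print(best_score)
```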

### Programming logic

Now that we've got basic building blocks, we can *do* things with them.

This requires programming logic: using logical statements to control the flow of our code in productive ways.

#### [Conditions](https://realpython.com/python-conditional-statements/)

In [16]:
age = 10
if age < 18:
    print('child')
else:
    print('adult')

# Can we add a third condition for teenager?

child
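
One possible answer to the teenager question uses `elif` (short for "else if") to add a third branch. The cutoff of 13 is an assumption made for this sketch.

```python
age = 15

# Conditions are checked from top to bottom; the first true one wins
if age < 13:
    label = 'child'
elif age < 18:
    label = 'teenager'
else:
    label = 'adult'

print(label)
```

Because the branches are checked in order, `age < 18` only needs to rule out adults; anyone younger than 13 has already been caught by the first condition.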


#### Loops

Python has both [`for` loops](https://realpython.com/python-for-loop/) and [`while` loops](https://realpython.com/python-while-loop/).

We're going to focus on `for` loops because they're most commonly used in data science. `while` loops are particularly handy in applications that respond to dynamic inputs from users or data streams.

In [17]:
ages = [5, 10, 65, 81, 45]

for age in ages:
    if age < 18:
        print('child')
    else:
        print('adult')

child
child
adult
adult
adult


In [18]:
people = {
    'Daniela': 5, 
    'Zoe': 10,
    'Rowen': 65,
    'Jude': 81,
    'Austin': 45,
}

for name, age in people.items():
    if age < 18:
        age_desc = 'a child'
    else:
        age_desc = 'an adult'
    print(f'{name} is {age_desc}')

Daniela is a child
Zoe is a child
Rowen is an adult
Jude is an adult
Austin is an adult
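
For comparison, here is a minimal sketch of a `while` loop, which repeats until its condition becomes false rather than stepping through every item. The names `index` and `first_adult_age` are introduced only for this example.

```python
ages = [5, 10, 65, 81, 45]

# Keep stepping forward until we reach an adult or run out of ages
index = 0
while index < len(ages) and ages[index] < 18:
    index += 1

if index < len(ages):
    first_adult_age = ages[index]
else:
    first_adult_age = None  # no adults in the list

print(first_adult_age)
```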


### Functions

Functions are integral to efficient programming, and core to a paradigm called ["functional programming"](https://realpython.com/python-functional-programming/).

Functions break up code into blocks that can be written in a general way, tested to ensure they do what you want them to, and reused over and over again by you and others. Like loops, they save having to write the same code repetitively. They also help keep your code organized, making it easier to locate and fix bugs. Eventually, you will start combining functions stored in lots of different places, allowing you to build on others' work.

In fact, you have already used functions. "Built-in" functions like `print` and `len` were written by someone long ago and stored deep in the bowels of Python's source code. All you need to do to use them is call them by their names and input information. Then they do some processing and spit out a result. 

<img src="https://miro.medium.com/v2/resize:fit:880/0*xMEO8AbXwdsgnHSH.png" alt="Diagram of a function with input and output" width="400"/>

The inputs to a function are called "arguments." Functions can take no arguments or any number of them, but let's stick with one or two to demonstrate how they usually work.

Let's first write a function, then use it. Using a function is called "calling" it.

In [19]:
# The statement to define a function starts with 'def', then a space, then the function name, then parentheses listing the arguments
def label_age(age): 
    if age < 18:
        label = 'child'
    else:
        label = 'adult'
    return label

In [20]:
# To call the function, we write its name, then put values for the arguments in parentheses
label_age(20)

'adult'

In [21]:
# We could have multiple arguments.
def label_age(name, age): 
    if age < 18:
        label = 'a child'
    else:
        label = 'an adult'
    return f'{name} is {label}'

In [22]:
# You need to list them in the correct order when you call the function (positional arguments)
label_age('Chester', 20)

'Chester is an adult'

In [23]:
# This won't work...
label_age(20, 'Chester')

TypeError: '<' not supported between instances of 'str' and 'int'

In [24]:
# Or specify them overtly (keyword arguments)
label_age(age=20, name='Chester')

'Chester is an adult'
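
Keyword arguments pair naturally with default values, which let callers skip an argument unless they want to override it. This is a sketch only: the function name `label_age_with_cutoff` and the `adult_age` argument are hypothetical additions, not part of the demo.

```python
# adult_age has a default value, so callers can leave it out
def label_age_with_cutoff(name, age, adult_age=18):
    if age < adult_age:
        label = 'a child'
    else:
        label = 'an adult'
    return f'{name} is {label}'


default_result = label_age_with_cutoff('Chester', 20)
custom_result = label_age_with_cutoff('Chester', 20, adult_age=21)

print(default_result)
print(custom_result)
```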

#### Namespaces

Functions are a good way to understand a somewhat complicated (but ultimately VERY useful) aspect of Python: namespaces.

Namespaces refer to the parts of code where certain variables, _names_, exist and are accessible to other code. Having different namespaces allows the same variable names to be used by multiple parts of the code without having to keep such careful track of what values are assigned to them or whether they are being overwritten. 

Namespaces minimize naming clutter, maximize flexibility, and allow code to be written in ways that are generalizable to lots of applications.

The function we just wrote has two arguments, `name` and `age`, which are variables inside the function. It also defines another variable, `label`, which is usable inside the function. We say that these variables are _local_ to the function. We can see the variables local to a namespace by printing the output of the `locals` function (notice that it doesn't need any arguments).

In [25]:
def label_age(name, age): 
    if age < 18:
        label = 'a child'
    else:
        label = 'an adult'
    
    print(f'Local variables: {locals()}')
    
    return f'{name} is {label}'

In [26]:
label_age('Chester', 20)

Local variables: {'name': 'Chester', 'age': 20, 'label': 'an adult'}


'Chester is an adult'

If we go outside the function, those variables won't necessarily have the same values assigned to them, and they might not be defined at all.

In [27]:
print(name)
print(age)
print(label)

Austin
45


NameError: name 'label' is not defined
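
The same separation works in the other direction. In this sketch, a variable named `label` is deliberately defined outside the function (an assumption for illustration) to show that assigning to `label` inside the function leaves the outer one untouched.

```python
label = 'defined outside the function'


def label_age(name, age):
    # This 'label' is local: it exists only while the function runs
    if age < 18:
        label = 'a child'
    else:
        label = 'an adult'
    return f'{name} is {label}'


result = label_age('Chester', 20)

print(result)
print(label)  # the outer variable is unchanged
```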

We can also see all the local variables in this namespace; we've built up a lot!

Bonus: what data type does the `locals` function return?

In [28]:
# locals()

#### Something to watch out for with function namespaces:

If you include a previously-defined variable in a function definition and don't name it as an argument, its value in the function will be set by the _outer_ namespace where it was first defined.

This function gets `age` from the _global_ namespace, which is a much more complicated and less controlled space to operate in.

It's best practice to do as much as possible within local namespaces. This means adding all data needed by a function as arguments.

In [29]:
age = 20

def label_age(name):
    if age < 18:
        label = 'a child'
    else:
        label = 'an adult'
    # print(f'Local variables: {locals()}')
    # print('')
    # print(f'Global variables: {globals()}')
    return f'{name} is {label}'
    

In [30]:
label_age(name)

'Austin is an adult'

In [31]:
age = 10
label_age(name)

'Austin is a child'
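
A sketch of the safer pattern: when `age` is passed as an argument, the result depends only on what the caller supplies, no matter what a global variable of the same name holds.

```python
def label_age(name, age):
    if age < 18:
        label = 'a child'
    else:
        label = 'an adult'
    return f'{name} is {label}'


age = 10  # a global variable that happens to share the argument's name

# The function uses the value passed in (45), not the global age (10)
result = label_age('Austin', 45)
print(result)
```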

### Goodies

There is often more than one way to do something in Python.

For simple/common statements, there are often simpler ways to write them. This is part of what it means for syntax to be ["Pythonic."](https://www.udacity.com/blog/2020/09/what-is-pythonic-style.html)

#### Conditional expressions

Instead of writing a basic condition with indented code blocks you can compress it into a single statement.

In [32]:
name = 'Chester'

if len(name) > 8:
    length = 'long'
else:
    length = 'short'

print(length)

# is the same as...

length = 'long' if len(name) > 8 else 'short'

print(length)

short
short
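
Because a conditional expression produces a value, it can go anywhere a value is expected, such as a `return` statement or inside an f-string. A small sketch, reusing the age example from earlier:

```python
def label_age(age):
    # The whole if/else collapses into the returned value
    return 'child' if age < 18 else 'adult'


age = 10
sentence = f"Zoe is {'a child' if age < 18 else 'an adult'}"

print(label_age(age))
print(sentence)
```

For anything more involved than a simple either/or choice, the indented form is usually easier to read.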


#### List comprehensions

Instead of looping over a list with an indented code block, you can operate on each element of it with a single statement.

Using list comprehensions well will immediately make you look like a pro.

In [33]:
names = ['Daniela', 'Zoe', 'Rowen', 'Jude', 'Austin']

first_letters = []
for name in names:
    first_letters.append(name[0])
print(first_letters)

# is the same as...

first_letters = [name[0] for name in names]
print(first_letters)

['D', 'Z', 'R', 'J', 'A']
['D', 'Z', 'R', 'J', 'A']


In [34]:
lower_case_names = []
for name in names:
    lower_case_names.append(name.lower())
print(lower_case_names)

# is the same as...

lower_case_names = [name.lower() for name in names]
print(lower_case_names)

['daniela', 'zoe', 'rowen', 'jude', 'austin']
['daniela', 'zoe', 'rowen', 'jude', 'austin']
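
List comprehensions can also filter: adding an `if` clause at the end keeps only the elements that pass a test. A minimal sketch, where the four-letter cutoff is arbitrary:

```python
names = ['Daniela', 'Zoe', 'Rowen', 'Jude', 'Austin']

# Keep only the names with four or fewer letters
short_names = [name for name in names if len(name) <= 4]

# Filtering and transforming can be combined in one statement
short_names_upper = [name.upper() for name in names if len(name) <= 4]

print(short_names)
print(short_names_upper)
```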


### Errors and debugging

Errors are frustrating and inevitable. Even professional programmers probably spend most of their time debugging.

Luckily, there are good tools and techniques for making debugging a little easier.

Despite these, you will probably nearly tear your hair out with some frequency, especially as a beginner. It will get better with time.

There are two types of errors in programming: logic and syntax. They both result in your program not achieving its goal, but the first may not be as easily detectable because the code may still run.

#### Logic errors
These are issues with how you have approached or executed your problem. If your code runs but produces nonsensical results, there is probably a logic error. However, your erroneous code might also produce logical but *wrong* results; you might never notice until the problem has rippled downstream. It's best to address this proactively by planning your code well so it's less likely to be illogical, and writing readable code that can be easily reviewed.

Here's a logic error. Can you find it? (Hint: the issue is syntactical, but it's still a logic error because the code works without throwing an error.)

In [35]:
for name, age in people.items():
    if age < 18:
        age_desc = 'a child'
    else:
        age_des = 'an adult'
    print(f'{name} is {age_desc}')

Daniela is a child
Zoe is a child
Rowen is a child
Jude is a child
Austin is a child
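
For reference, here is one way the loop could be fixed, along with a quick spot check against a value whose answer we already know. Collecting results in a `descriptions` list is a choice made for this sketch so they can be checked; the original simply prints them.

```python
people = {'Daniela': 5, 'Zoe': 10, 'Rowen': 65, 'Jude': 81, 'Austin': 45}

descriptions = []
for name, age in people.items():
    if age < 18:
        age_desc = 'a child'
    else:
        # This was misspelled 'age_des', so age_desc never changed
        age_desc = 'an adult'
    descriptions.append(f'{name} is {age_desc}')

# Spot check: Rowen is 65, so this should not say 'a child'
assert descriptions[2] == 'Rowen is an adult'
print(descriptions)
```

Small checks like the `assert` line are a cheap way to catch logic errors that would otherwise run silently.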


#### Syntax errors
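These are more obvious because your code will simply fail. Strictly speaking, Python reserves the name `SyntaxError` for code it cannot parse at all; errors raised while code is running, like the `ValueError` below, are called exceptions. Either way, execution stops with an error message, and there are lots of tools for figuring out where and why.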

Error messages are usually the starting place for debugging a syntax error.

In [36]:
for name, age in people:
    if agete < 18:
        age_desc = 'a child'
    else:
        age_desc = 'an adult'
    print(f'{name} is {age_desc}')

ValueError: too many values to unpack (expected 2)

The error message tells us where the problem is located.

Sometimes, it can be helpful to turn on line numbers.
- In Colab: `Tools -> Settings -> Editor -> Show line numbers`
- In JupyterLab: `View -> Show Line Numbers`

The `ValueError` tells us that the issue is related to the value of a variable on this line, but it's still pretty vague.

Time to start [Googling](https://www.google.com/).
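
One way to investigate is to look at what the loop actually receives. The sketch below uses a shortened version of the `people` dictionary. It shows that looping over a dictionary directly yields only its keys, so Python tries to unpack each name's letters into `name` and `age`. The `try`/`except` block reproduces the error without stopping the program.

```python
people = {'Daniela': 5, 'Zoe': 10}

# Looping over a dictionary directly yields only its keys
keys_only = [item for item in people]

# .items() yields (key, value) pairs, which unpack into two names
pairs = [(name, age) for name, age in people.items()]

# Reproduce the error safely so we can read its message
try:
    for name, age in people:
        pass
except ValueError as error:
    message = str(error)

print(keys_only)
print(pairs)
print(message)
```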

You can also get some help with Python functions and objects with the `help` function:

In [37]:
help(len)

Help on built-in function len in module builtins:

len(obj, /)
    Return the number of items in a container.



In [38]:
# help(goodness_at_sports)

## Pseudocode

You can work out programming logic without even writing code.

In fact, it's a great idea to start with pseudocode.

That way you can think big-picture without getting distracted by the intricacies of syntax or availability of pre-existing components.

In [39]:
# Pseudocode for testing whether each person is a child or an adult

# structure the data as a dictionary of people with their ages
# loop through the people
    # if the person's age is < 18:
        # return child
    # otherwise:
        # return adult
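
One possible translation of this pseudocode into Python is sketched below. Storing the results in a `labels` dictionary is an assumption made for this example; the pseudocode leaves open what to do with each result.

```python
# structure the data as a dictionary of people with their ages
people = {'Daniela': 5, 'Zoe': 10, 'Rowen': 65, 'Jude': 81, 'Austin': 45}


def label_age(age):
    # if the person's age is < 18, return child; otherwise, return adult
    if age < 18:
        return 'child'
    return 'adult'


# loop through the people
labels = {}
for name, age in people.items():
    labels[name] = label_age(age)

print(labels)
```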

Can we write pseudocode for making scrambled eggs?