Urban Data Science & Smart Cities <br>
URSP688Y Spring 2025<br>
Instructor: Chester Harvey <br>
Urban Studies & Planning <br>
National Center for Smart Growth <br>
University of Maryland

# Demo 2 - Development Environment & More Programming Fundamentals

- Conda/Jupyter Setup
- Hands-On Intro to Notebooks
- Intro to programming (with Python)
    - Review: variables, operators, and basic data types
    - Composite data types
        - Lists
        - Dictionaries
    - Conditions and loops
    - Errors and debugging

## Conda & Jupyter Setup

We're going to run Python on our computers using Conda, an open-source system for managing Python environments.

We're going to use a distribution of Conda called [Miniconda (scroll down)](https://www.anaconda.com/download/success) that is designed to take up minimal space on your computer. It installs only Conda, Python, and a small set of supporting packages, without a large bundle of extras.

We'll install JupyterLab in a Conda environment and use it to open Jupyter notebooks, where we'll do our programming.

### Initial Setup 

1. Download and install [Miniconda (scroll down)](https://www.anaconda.com/download/success)
2. Open a command line:
    - In Windows: Anaconda Prompt (**not** the regular Windows command prompt)
    - In macOS: Terminal
3. Make a new environment by typing or copying this command:
    - `conda create -n 688y jupyterlab` [Enter]
    - This will make a new environment named `688y` with the latest stable version of `jupyterlab` installed in it
    - You can find complete [documentation for managing conda environments here](https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html)
4. Activate your environment by typing or copying this command:
    - `conda activate 688y` [Enter]
5. Start JupyterLab by typing or copying this command:
    - `jupyter lab` [Enter]
  
### Regular Startup
1. Open a command line:
    - In Windows: Anaconda Prompt (**not** the regular Windows command prompt)
    - In macOS: Terminal
        - Shortcut: In GitHub Desktop, right-click on the repo in the sidebar and click [Open in Terminal]
2. Optionally, `cd` into the directory you're working in:
    - In Windows:
        - Copy your directory path from Explorer or GitHub Desktop
        - In the Anaconda Prompt, type `cd`, add a space, then paste the path [Enter]
    - In macOS:
        - Right-click on your directory, press [Option] on keyboard, and click [Copy "..." as Pathname]
        - In Terminal, type `cd`, add a space, then paste the path [Enter]
3. Activate your environment by typing or copying this command:
    - `conda activate 688y` [Enter]
4. Start JupyterLab by typing or copying this command:
    - `jupyter lab` [Enter]
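
Once JupyterLab is running, it can be worth confirming that a notebook is really using the `688y` environment. The sketch below is one way to check, using only Python's standard library. The variable names are illustrative, and the path printed will differ from computer to computer.

```python
import sys

# Path of the Python interpreter running this notebook.
# If the 688y environment is active, the path usually contains "688y".
interpreter_path = sys.executable
print(interpreter_path)

# Version of Python in this environment, formatted as major.minor
python_version = f'{sys.version_info.major}.{sys.version_info.minor}'
print(python_version)
```

If the path points somewhere unexpected, the environment probably was not activated before starting JupyterLab.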

### Notebooks in JupyterLab

Text cells in [Markdown](https://www.markdownguide.org/basic-syntax/)

In [4]:
# Code cells to execute Python
print('Hello world!')

Hello world!


In [9]:
# Code cells preview the last line, which is often synonymous with 'print'
name = 'Chester'
name

'Chester'

### Keyboard Shortcuts

To execute a cell:
- [Shift] + [Enter] executes and goes to next cell (cycle through notebook one cell at a time)
- [Ctrl] + [Enter] executes and stays on same cell

There are [***a lot*** more keyboard shortcuts](https://towardsdatascience.com/jypyter-notebook-shortcuts-bf0101a98330).

### Review

#### Variables

- Name-based storage containers
- Can be easily recalled
- Can usually be updated

In [7]:
name = 'Chester'
name

'Chester'

Variables can be recalled in a program.

In [10]:
day = 'Monday'
calendar_text = f'The day of the week is {day}'
calendar_text

'The day of the week is Monday'
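
Variables can also be updated after they are created. Here is a small sketch of two common patterns: replacing a value outright and updating a value based on what it currently holds. The `visit_count` variable is a made-up example.

```python
# Assign a variable, then update it by assigning a new value to the same name
day = 'Monday'
print(day)

day = 'Tuesday'
print(day)

# A variable can also be updated based on its current value
visit_count = 1
visit_count = visit_count + 1
print(visit_count)
```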

#### [Operators](https://www.tutorialspoint.com/python/python_basic_operators.htm)

In [11]:
1 + 1

2

In [12]:
1 < 2

True
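
The two cells above show one arithmetic operator and one comparison operator. A few more that come up often are sketched below; the variable names are made up for illustration.

```python
# Arithmetic operators
print(7 / 2)    # division always returns a float: 3.5
print(7 // 2)   # floor division: 3
print(7 % 2)    # remainder (modulo): 1
print(2 ** 3)   # exponent: 8

# Comparison operators return booleans
print(3 == 3)   # True
print(3 != 3)   # False

# Logical operators combine booleans
is_weekend = False
is_holiday = True
day_off = is_weekend or is_holiday
print(day_off)  # True
```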

#### Basic Data Types

In [24]:
# String (text)
name = 'Chester'
print(name)
print(type(name))

Chester
<class 'str'>


In [21]:
# Integer
year = 2025
print(year)
print(type(year))

2025
<class 'int'>


In [25]:
# Float (decimal)
miles = 6.37
print(miles)
print(type(miles))

6.37
<class 'float'>


In [26]:
# Boolean (true or false)
a_age = 10
b_age = 20
a_older = a_age > b_age
print(a_older)
print(type(a_older))

False
<class 'bool'>
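
Values sometimes arrive as the wrong type, such as a number stored as text. The sketch below shows how the built-in `int()`, `float()`, and `str()` functions can convert between the basic types. The variable names are illustrative.

```python
# A number stored as text is a string, not an integer
year_text = '2025'
print(type(year_text))

# Convert between types with int(), float(), and str()
year = int(year_text)
next_year = year + 1
print(next_year)

miles = float('6.37')
print(miles)

# Numbers must be converted to strings before joining them to text with +
label = 'Year: ' + str(year)
print(label)
```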


### Composite Data Types

#### List
An ordered array of objects.

In [22]:
fridge_contents = ['milk','apple','celery','yogurt']
print(type(fridge_contents))
print(fridge_contents)

<class 'list'>
['milk', 'apple', 'celery', 'yogurt']


In [23]:
# You can add lists together
fridge_contents = fridge_contents + ['orange juice', 'leftovers']
fridge_contents

['milk', 'apple', 'celery', 'yogurt', 'orange juice', 'leftovers']

In [24]:
# Or append elements to a list
fridge_contents.append('cheese')
fridge_contents

['milk', 'apple', 'celery', 'yogurt', 'orange juice', 'leftovers', 'cheese']

In [25]:
# Or remove things
fridge_contents.remove('yogurt')
fridge_contents

['milk', 'apple', 'celery', 'orange juice', 'leftovers', 'cheese']

In [26]:
# You can look things up in a list by index number, starting with 0
fridge_contents[0]

'milk'

In [27]:
# Or get a slice from a list
fridge_contents[:2]

['milk', 'apple']
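
A few other list operations are useful to know. This sketch restates the list so it runs on its own, then shows counting, negative indexes, membership checks, and slices with both a start and an end.

```python
fridge_contents = ['milk', 'apple', 'celery', 'orange juice', 'leftovers', 'cheese']

# Count the elements in a list
item_count = len(fridge_contents)
print(item_count)

# Negative indexes count from the end of the list
last_item = fridge_contents[-1]
print(last_item)

# Check whether something is in a list
has_yogurt = 'yogurt' in fridge_contents
print(has_yogurt)

# Slices can have a start and an end (the end index is not included)
middle_items = fridge_contents[1:3]
print(middle_items)
```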

#### Dictionary

Labeled data stored as key-value pairs.

*Note*: Dictionaries used to be unordered, but since Python 3.7 they are guaranteed to keep the order in which keys were inserted (CPython 3.6 already behaved this way as an implementation detail). Lists are still usually preferred when order matters. There's also something called an [ordered dictionary](https://realpython.com/python-ordereddict/), which makes it more explicit that you care about order and can make it easier to manage or change order.
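
To see the ordering behavior described in the note, here is a small sketch. The `commute_minutes` dictionary and its values are hypothetical.

```python
# Keys come back in the order they were added (guaranteed in Python 3.7+)
commute_minutes = {}
commute_minutes['bus'] = 35
commute_minutes['bike'] = 25
commute_minutes['walk'] = 50

# Converting a dictionary to a list gives its keys, in insertion order
modes = list(commute_minutes)
print(modes)
```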

In [28]:
goodness_at_sports = {
    'basketball': 2,
    'baseball': 1,
    'skiing': 8,
    'volleyball': 3,
}
print(type(goodness_at_sports))
print(goodness_at_sports)

<class 'dict'>
{'basketball': 2, 'baseball': 1, 'skiing': 8, 'volleyball': 3}


In [29]:
# You can add an element to a dictionary by assigning a new key
goodness_at_sports['cornhole'] = 3

In [30]:
# And remove one
goodness_at_sports.pop('baseball')

1

In [31]:
# And look up a value through its key
goodness_at_sports['skiing']

8
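
Looking up a key that isn't in the dictionary raises a `KeyError`. The sketch below restates the dictionary so it runs on its own, then shows two ways to handle a possibly missing key, plus how to update an existing value. The default of `0` is an arbitrary choice for illustration.

```python
goodness_at_sports = {
    'basketball': 2,
    'skiing': 8,
    'volleyball': 3,
    'cornhole': 3,
}

# Check whether a key exists before looking it up...
knows_baseball = 'baseball' in goodness_at_sports
print(knows_baseball)

# ...or use .get(), which returns a default value if the key is missing
baseball_score = goodness_at_sports.get('baseball', 0)
print(baseball_score)

# Update an existing value by assigning to its key
goodness_at_sports['cornhole'] = 4
print(goodness_at_sports['cornhole'])
```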

### Programming logic

Now that we've got basic building blocks, we can *do* things with them.

This requires programming logic: using logical statements to control the flow of our code in productive ways.

#### [Conditions](https://realpython.com/python-conditional-statements/)

In [38]:
age = 12
if age < 18:
    adult = False
else:
    adult = True
adult

False
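
When there are more than two possible outcomes, `elif` (short for "else if") adds extra branches. A minimal sketch, with age groups and cutoffs chosen only for illustration:

```python
age = 70

# Conditions are checked from top to bottom; only the first match runs
if age < 18:
    age_group = 'child'
elif age < 65:
    age_group = 'adult'
else:
    age_group = 'senior'

print(age_group)
```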

#### Loops

Loops can iterate through composite data, like lists and dictionaries.

'For' loops are the most common type used in data science.

In [31]:
# Looping through a list
ages = [5, 10, 65, 81, 45]

for age in ages:
    if age < 18:
        adult = False
    else:
        adult = True
    print(adult)

False
False
True
True
True


In [32]:
# Looping through key-value pairs in a dictionary using .items()
people = {
    'Daniela': 5, 
    'Zoe': 10,
    'Rowen': 65,
    'Jude': 81,
    'Austin': 45,
}

for name, age in people.items():
    if age < 18:
        adult = False 
    else:
        adult = True
    print(f'{name}: {adult}')

Daniela: False
Zoe: False
Rowen: True
Jude: True
Austin: True
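
Loops are often used to build up a result rather than just print things. The sketch below collects the names of adults into a new list, then shows `range()`, which loops over a sequence of numbers. The `adult_names` variable is an illustrative name.

```python
people = {
    'Daniela': 5,
    'Zoe': 10,
    'Rowen': 65,
    'Jude': 81,
    'Austin': 45,
}

# Start with an empty list, then append to it inside the loop
adult_names = []
for name, age in people.items():
    if age >= 18:
        adult_names.append(name)

print(adult_names)

# range(3) produces the numbers 0, 1, 2
for i in range(3):
    print(i)
```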


---
We made it this far in class on Week 2

---

### Functions

Functions are pre-defined programming components that do things. Often, they take inputs and produce outputs.

<img src="https://miro.medium.com/v2/resize:fit:880/0*xMEO8AbXwdsgnHSH.png" alt="Diagram of a function with input and output" width="400"/>

- Some basic functions are built into Python (e.g., `print`)

- We can write our own custom functions.

- We can use custom functions other people have written.

In [39]:
# Let's write a function that takes an age as input and tells us whether a person is an adult
def check_adult(age):
    if age < 18:
        adult = False
    else:
        adult = True
    return adult

In [41]:
check_adult(20)

True

In [42]:
for age in ages:
    print(check_adult(age))

False
False
True
True
True
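
Functions can take more than one input, and inputs can have default values. The sketch below is one possible variation on `check_adult`. The `adult_age` parameter is an assumption added for illustration, not part of the original function.

```python
def check_adult(age, adult_age=18):
    """Return True if age is at or above adult_age."""
    # A comparison already evaluates to True or False,
    # so it can be returned directly
    return age >= adult_age

# Uses the default cutoff of 18
print(check_adult(20))

# Overrides the default with a keyword argument
print(check_adult(20, adult_age=21))
```

Returning the comparison directly gives the same results as the `if`/`else` version above, with fewer lines to review.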


### Errors and debugging

Errors are frustrating and inevitable. Even professional programmers probably spend most of their time debugging.

Luckily, there are good tools and techniques for making debugging a little easier.

Despite these, you will probably nearly tear your hair out with some frequency, especially as a beginner. It will get better with time.

There are two types of errors in programming: logic and syntax. They both result in your program not achieving its goal, but the first may not be as easily detectable because the code may still run.

#### Logic errors
These are issues with how you have approached or executed your problem. If your code runs but produces nonsensical results, there is probably a logic error. However, your erroneous code might also produce logical but *wrong* results; you might never notice until the problem has rippled downstream. It's best to address this proactively by planning your code well so it's less likely to be illogical, and writing readable code that can be easily reviewed.

Here's a logic error. Can you find it? (Hint: the issue is syntactical, but it's still a logic error because the code works without throwing an error.)

In [44]:
def check_adult(age):
    if age > 18:
        adult = False
    else:
        adult = True
    return adult

check_adult(20)

False
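
One way to catch logic errors like this early is to check a function against a few inputs where the right answer is already known. The sketch below uses `assert` statements with a corrected version of the function. This is just one possible technique, and the test ages are chosen for illustration.

```python
def check_adult(age):
    # Corrected comparison: people under 18 are not adults
    if age < 18:
        adult = False
    else:
        adult = True
    return adult

# Test values on both sides of the boundary, plus the boundary itself.
# An assert does nothing if the condition is True and
# raises an AssertionError if it is False.
assert not check_adult(17)
assert check_adult(18)
assert check_adult(20)
print('All checks passed')
```

Running the same checks against the version with `age > 18` would raise an `AssertionError` on the first check, pointing straight at the problem.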

#### Syntax errors
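These are more obvious because your code will simply fail with an error message. (Strictly speaking, Python reserves the name `SyntaxError` for code it cannot parse at all; the example below raises a different kind of error while running, but the debugging approach is the same.) There are lots of tools for figuring out where and why.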

Error messages are usually the starting place for debugging a syntax error.

In [45]:
def check_adult(age):
    if age < 18:
        adult = False
    else:
        adult = True
    return adult

check_adult('20')

TypeError: '<' not supported between instances of 'str' and 'int'

The error message tells us where the problem is located.

Sometimes, it can be helpful to turn on line numbers.
- In Colab: `Tools -> Settings -> Editor -> Show line numbers`
- In JupyterLab: `View -> Show Line Numbers`

The `TypeError` tells us that the issue is related to the type of a variable on that line (a string is being compared with an integer), but it's still pretty vague about how to fix it.

Time to start [Googling](https://www.google.com/).
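
For the `TypeError` above, searching the error message would likely lead to two common approaches, sketched below: convert the input to the expected type, or catch the error so the program can respond instead of crashing. The `age_text` variable is an illustrative name.

```python
def check_adult(age):
    if age < 18:
        adult = False
    else:
        adult = True
    return adult

age_text = '20'

# Option 1: convert the string to an integer before calling the function
print(check_adult(int(age_text)))

# Option 2: catch the error and report it instead of crashing
try:
    check_adult(age_text)
except TypeError as error:
    print(f'Could not check age: {error}')
```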


### Style guidelines for Python
- At the very least, do things consistently
- One statement per line
- Try to limit line length to 79 characters (72 for comments and docstrings)
- Use four spaces to indent
- Put spaces around operators (e.g., `1 + 1` or `day = 'Monday'`) (except in keyword function arguments)
- Use blank lines intentionally and consistently
- Use meaningful names
- Name variables and functions with `lowercase_underscores`
- Constants are often named in `ALL_CAPS_WITH_UNDERSCORES` (e.g., `SPEED_OF_LIGHT = 2.99792458e+8`)
- Name custom classes with `CapWords`
- In general, avoid spaces in folder names and filenames used for programming
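
Here is a short sketch that applies several of these guidelines at once. All of the names (`METERS_PER_MILE`, `miles_to_meters`, `BikeTrip`) are made up for illustration.

```python
# Constant: ALL_CAPS_WITH_UNDERSCORES
METERS_PER_MILE = 1609.344


# Function and variable names: lowercase_underscores
def miles_to_meters(miles):
    # Four-space indent, spaces around operators
    return miles * METERS_PER_MILE


# Custom class name: CapWords
class BikeTrip:
    def __init__(self, distance_miles):
        self.distance_miles = distance_miles


# No spaces around = in keyword arguments
trip = BikeTrip(distance_miles=6.37)
trip_meters = miles_to_meters(trip.distance_miles)
print(round(trip_meters))
```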

See [Code Readability](https://github.com/ncsg/ursp688y_sp2024/blob/main/README.md#code-readability) on the syllabus. [CS61A](https://cs61a.org/articles/composition/) has an excellent composition guide. [PEP 8](https://peps.python.org/pep-0008/) is a standard Python style guide. [Google](https://google.github.io/styleguide/pyguide.html) publishes their internal Python style guide.