Urban Data Science & Smart Cities <br>
URSP688Y <br>
Instructor: Chester Harvey <br>
Urban Studies & Planning <br>
National Center for Smart Growth <br>
University of Maryland

[<img src="https://colab.research.google.com/assets/colab-badge.svg"> Clean version](https://colab.research.google.com/github/ncsg/ursp688y_sp2024/blob/main/demos/demo04/demo04.ipynb)

[<img src="https://colab.research.google.com/assets/colab-badge.svg"> Modified in class](https://colab.research.google.com/drive/11bvlfaXuamFZ__Bb97a2IGYom05Mtj5z?usp=sharing)

# Demo 4 - Loading Data

- Using a debugger (and installing packages with `pip`)
- Connecting to Google Drive in Colab
- Loading data from files
    - CSV
    - Excel
    - JSON
- Repository structure
- Introducing the final project

## Using a debugger (and installing packages with `pip`)

Sometimes things just really don't work and it's hard to figure out why. This can be especially true when there is a lot of nesting with names accessible only inside functions or loops.

There are special tools for debugging that can help step through code one line at a time, stop in specific places, and understand the values stored in variables at specific points in the program.

A good way to [implement this in a Jupyter notebook](https://zohaib.me/debugging-in-google-collab-notebook/), including Colab, is with a package called `ipdb`. Unfortunately, `ipdb` does not come pre-installed in Colab, so we'll need to install it before we can import it.

This is as easy as using a special character (`!`) to ask Colab to run a command as if it were typed on the computer's command line rather than run by the Python interpreter.

We're using a program called `pip`, which downloads `ipdb` from its online repository, the Python Package Index (PyPI), and installs it.

With Colab, you need to reinstall `ipdb` in every new session because Colab wipes your virtual computer clean when the session times out.

In [33]:
!pip install -Uqq ipdb # This version will run 'quietly'

# !pip install ipdb # This version will show log outputs
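Because the install has to be repeated in each new session, one option is to check whether a package is already importable before calling `pip`. The following is a minimal sketch, not part of the original demo: `ensure_package` is a hypothetical helper name, and it assumes the package's install name matches its import name (true for `ipdb`, but not for every package).

```python
import importlib.util
import subprocess
import sys


def ensure_package(name):
    """Install `name` with pip only if it cannot already be imported.

    Hypothetical helper for illustration. Returns True if pip was run,
    False if the package was already available.
    """
    if importlib.util.find_spec(name) is not None:
        return False  # Already importable, so there is nothing to install
    # Same idea as `!pip install -q name`, but launched from Python
    subprocess.check_call([sys.executable, "-m", "pip", "install", "-q", name])
    return True


# The standard-library module `json` is always present, so pip is not run here
print(ensure_package("json"))
```

In a notebook, calling `ensure_package("ipdb")` before `import ipdb` would skip the download whenever the package is still installed from earlier in the session.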

Now that we have ipdb, we can import it and start debugging.

In [34]:
import ipdb

Here's a loop with a logic error (this may look familiar from last week):

In [35]:
people = {'Daniela': 5, 'Rowen': 65, 'Zoe': 10, 'Jude': 81, 'Austin': 45}

for name, age in people.items():
    if age < 18:
        age_desc = 'a child'
    else:
        age_des = 'an adult'
    # ipdb.set_trace() # Here is the breakpoint where we'll inspect
    print(f'{name} is {age_desc}')

Daniela is a child
Rowen is a child
Zoe is a child
Jude is a child
Austin is a child


Let's use ipdb to dig into why every person is being listed as a child rather than an adult.

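First, we turn on automatic debugging with a shortcut called a 'magic command,' which is only valid within a notebook and isn't technically Python. With `%pdb on`, the notebook opens the debugger automatically whenever a cell raises an exception. Breakpoints set with `ipdb.set_trace()` work whether or not this setting is on.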

In [36]:
%pdb on

# %pdb off # Turn it off

Automatic pdb calling has been turned ON


### Breakpoints

Add `ipdb.set_trace()` anywhere in a program to pause it for inspection.

Then use the commands below to control execution as needed.

|Command|Description|
|--- |--- |
|h(elp)|Show the commands supported by ipdb|
|h(elp) COMMAND|Show a description of the specified COMMAND|
|c(ontinue)|Continue executing until the next breakpoint|
|n(ext)|Execute until the next line in the current frame. A function call on the current line runs to completion without being stepped into.|
|s(tep)|Execute the current line and stop at the first opportunity. If the line calls a function, step into that function.|
|r(eturn)|Continue executing until the current function returns or another breakpoint is hit|
|l(ist)|Show the source code surrounding the current line|
|w(here)|Show the stack trace, i.e., the chain of function calls that led to the current line|
|a(rgs)|List the arguments passed to the current function and their values|
|q(uit)|Immediately stop execution and quit the debugger|

### "Cheap" debugging
- `print`
- `break`
- `pass`
- early returns from functions
- `try` and `except` (catch an error and inspect it instead of letting the program stop)
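As an illustration of the first item, here is a sketch of `print` debugging applied to the loop from earlier. The bug is left in on purpose. Resetting `age_desc` to `None` at the top of each pass and collecting results in a `descriptions` dictionary are additions made for this example, not part of the original loop; the reset makes it obvious when the `else` branch fails to set the value.

```python
people = {'Daniela': 5, 'Rowen': 65, 'Zoe': 10, 'Jude': 81, 'Austin': 45}

descriptions = {}
for name, age in people.items():
    age_desc = None  # Reset each pass so a stale value can't carry over
    if age < 18:
        age_desc = 'a child'
    else:
        age_des = 'an adult'  # Typo kept on purpose: this is the bug
    # "Cheap" debugging: print the values the loop is actually working with
    print(f'DEBUG name={name!r} age={age} age_desc={age_desc!r}')
    descriptions[name] = age_desc
```

For every adult, the debug line shows `age_desc=None`, which points to the misspelled `age_des` in the `else` branch: the assignment creates a new name instead of updating `age_desc`.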

## Connecting to Google Drive in Colab

In [2]:
from google.colab import drive
drive.mount('/content/drive')

### Working directory

In [3]:
import os

In [4]:
os.getcwd()

'/Users/cwharvey/github/ursp688y_sp2024_dev/demos/demo04'

In [6]:
!ls

demo04.ipynb


In [7]:
os.chdir('/content/drive/MyDrive/Teaching/URSP688Y Spring 2024/Modified Demo Notebooks')

FileNotFoundError: [Errno 2] No such file or directory: '/content/drive/MyDrive/Teaching/URSP688Y Spring 2024/Modified Demo Notebooks'
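`os.chdir` raises `FileNotFoundError` when the target folder does not exist from where the notebook is running, for example if Drive has not been mounted in the current session. One way to fail more gently is to check for the folder first. This is a minimal sketch: `change_dir_if_exists` is a hypothetical helper, and the paths are temporary stand-ins rather than the Drive path above.

```python
import os
import tempfile


def change_dir_if_exists(path):
    """Change the working directory only if `path` is an existing directory.

    Hypothetical helper for illustration. Returns True on success,
    False if the directory was not found.
    """
    if os.path.isdir(path):
        os.chdir(path)
        return True
    print(f'Directory not found: {path}')
    return False


original_dir = os.getcwd()

# A stand-in path that is assumed not to exist
missing = os.path.join(tempfile.gettempdir(), 'no_such_folder_for_demo')
print(change_dir_if_exists(missing))

# A temporary directory that definitely exists
with tempfile.TemporaryDirectory() as tmp:
    print(change_dir_if_exists(tmp))
    os.chdir(original_dir)  # Step back out before the temp folder is deleted
```

The same check could be applied to a path under `/content/drive/MyDrive` after calling `drive.mount`.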

## Loading data from files
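As a rough sketch of two of the formats listed in the outline, the example below reads CSV and JSON using only Python's standard library. The sample data is made up for illustration and held in memory rather than in files. For a file on disk, the same readers accept a file object from `open(path)`. Excel files need a third-party package and are not shown here.

```python
import csv
import io
import json

# In-memory stand-ins for files (made-up sample data, not course data)
csv_text = "name,age\nDaniela,5\nRowen,65\n"
json_text = '{"Zoe": 10, "Jude": 81}'

# CSV: each row becomes a dict keyed by the header row; values arrive as strings
rows = list(csv.DictReader(io.StringIO(csv_text)))
ages_from_csv = {row['name']: int(row['age']) for row in rows}

# JSON: objects map directly onto Python dicts, and numbers stay numbers
ages_from_json = json.loads(json_text)

print(ages_from_csv)
print(ages_from_json)
```

Note that CSV values must be converted explicitly (here with `int`), while JSON preserves basic types such as numbers and strings.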

## Repository structure

## Introducing the final project