# Lesson 01 - Introduction

This is a Jupyter notebook. In this class, we'll write all of our code in Jupyter notebooks.

Today, don't worry about how any of this works. Throughout the semester, we'll learn how each of these pieces works.

**Notes:** 

 - The maps in this notebook will not load correctly in Safari if you're on a Mac; use Chrome.

 - The cell below installs `googlemaps`, the Python client library for Google Maps.

In [None]:
!pip install -q googlemaps

**Note:** The cell below imports `googlemaps`, [plotly](https://plotly.com/python/) (an open-source graphing library), and other libraries that we'll learn more about throughout the course.

In [None]:
from datascience import *
import numpy as np
import plotly.graph_objects as go
import googlemaps
import matplotlib
import matplotlib.pyplot as plt
import warnings
warnings.simplefilter('ignore', FutureWarning)
warnings.filterwarnings("ignore", message = "Creating an ndarray from ragged")
warnings.filterwarnings("ignore", message = "FixedFormatter should only be used together with FixedLocator")
Table.interactive_plots()

## North Carolina Colleges and Universities

- Introduction to programming with Python

- Real-world data

- Basic data visualization

**Example 1.** Here, we'll load in data about all public and private colleges and universities in North Carolina. The data comes from this [Wikipedia article](https://en.wikipedia.org/wiki/List_of_colleges_and_universities_in_North_Carolina).

**Note:** The following examples use commands from the [datascience](http://data8.org/datascience/) library developed by faculty at UC Berkeley.

In [None]:
nc_col_uni = Table.read_table('data/nc_colleges_and_universties.csv')
nc_col_uni

Data is often stored in tables. Throughout the course, we'll become very, very familiar with how tables work. But for now, let's just observe.

**Example 2.** Show the first 20 rows in the table.

In [None]:
nc_col_uni.show(20)

Let's start asking questions.

**Question 1.** What's the largest public university in North Carolina?

In [None]:
nc_col_uni.where('Control', 'Public').sort('Enrollment (2020)', descending = True)
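
For the curious, here is a minimal plain-Python sketch of the same filter-then-sort idea used in the cell above. The rows are made-up placeholders (not values from the Wikipedia data), and `largest_first` is a hypothetical helper written for illustration, not part of the datascience library.

```python
# Hypothetical rows standing in for the real table; the numbers are placeholders.
rows = [
    {'School': 'School A', 'Control': 'Public', 'Enrollment': 1200},
    {'School': 'School B', 'Control': 'Private', 'Enrollment': 5000},
    {'School': 'School C', 'Control': 'Public', 'Enrollment': 3400},
]

def largest_first(rows, control):
    """Keep rows whose 'Control' matches, then sort by enrollment, largest first."""
    matching = [row for row in rows if row['Control'] == control]
    return sorted(matching, key=lambda row: row['Enrollment'], reverse=True)

public_sorted = largest_first(rows, 'Public')

# After sorting in descending order, the first row is the largest public school.
print(public_sorted[0]['School'])
```

The table version does the same two steps: `where` keeps the matching rows, and `sort` with `descending = True` puts the largest value first.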

Let's visualize the distribution of 2020 enrollment.

**Example 3.** Make a bar chart of the 2020 enrollment.

In [None]:
nc_col_uni.where('Control', 'Public').sort('Enrollment (2020)', descending = True).barh('School', 'Enrollment (2020)')

**Question 2.** What's the oldest university in North Carolina?

In [None]:
nc_col_uni.sort('Founded')

**Question 3.** What are the historically black colleges and universities (HBCUs) in North Carolina?

In [None]:
nc_hbcu = nc_col_uni.where('HBCU', 'Yes')
nc_hbcu.show()

**Example 4.** Let's visualize the total number of universities by year founded.

In [None]:
uni_copy = nc_col_uni.sort('Founded').with_columns('Total Universities', np.arange(1, nc_col_uni.num_rows + 1))
uni_copy.plot('Founded', 'Total Universities')
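
The running total in the cell above comes from a simple trick: once the rows are sorted by founding year, numbering them 1, 2, 3, ... gives the number of schools founded up to that point. Here is a rough sketch in plain Python, using made-up years rather than the real data.

```python
# Hypothetical founding years (placeholders, not taken from the real table).
founded = [1900, 1800, 1875, 1850]

# Sorting puts the earliest year first, much like .sort('Founded').
founded_sorted = sorted(founded)

# Number the rows 1, 2, ..., n; this mirrors np.arange(1, num_rows + 1).
running_total = list(range(1, len(founded_sorted) + 1))

# Each pair reads as: "by this year, this many schools had been founded".
for year, total in zip(founded_sorted, running_total):
    print(year, total)
```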

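With plotly, you can build a figure in layers. The cell below draws the same data twice: once as markers (with the school name shown on hover) and once as a blue line.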

In [None]:
fig = go.Figure()

fig.add_trace(
    go.Scatter(x = uni_copy.column('Founded'), 
               y = uni_copy.column('Total Universities'), 
               hovertext = uni_copy.column('School'),
               mode = 'markers',
              )
)

fig.add_trace(
    go.Scatter(x = uni_copy.column('Founded'), 
               y = uni_copy.column('Total Universities'),
               line = dict(color = 'blue'),
              )
)

fig.update_layout(title = 'Total Number of Colleges and Universities in North Carolina by Year',
                  xaxis_title = 'Year',
                  yaxis_title = 'Total Universities')

fig.show()

**Question 4.** Where are the colleges and universities in North Carolina located?

In [None]:
nc_col_uni_loc = Table.read_table('./data/nc_colleges_and_universities_locations.csv')
nc_col_uni_loc

In [None]:
Marker.map_table(nc_col_uni_loc, marker_icon = 'info-sign')

It would be nice if this were color-coded based on whether or not the institution was an HBCU. 

We can do that.

In [None]:
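# Collect the names of all HBCUs in a plain Python list.
hbcu_list = nc_hbcu.to_df()['School'].to_list()
# Add a 'colors' column: 'blue' for HBCUs, 'red' for everything else.
# Note: the colors are computed from nc_col_uni, so this relies on
# nc_col_uni and nc_col_uni_loc listing the schools in the same row order.
nc_col_uni_loc_sep = nc_col_uni_loc.with_columns(
    'colors', nc_col_uni.apply(lambda s: 'blue' if s in hbcu_list else 'red', 'School')
)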
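
The color-coding step above boils down to one decision per school: is its name in the list of HBCUs? Here is a small sketch of that decision in plain Python. The school names are placeholders, and `pick_color` is a hypothetical helper that plays the role of the `lambda` in the cell above.

```python
# Hypothetical school names; 'School B' plays the role of an HBCU here.
hbcu_names = ['School B']
schools = ['School A', 'School B', 'School C']

def pick_color(school):
    """Return 'blue' for schools in hbcu_names and 'red' for all others."""
    return 'blue' if school in hbcu_names else 'red'

# Apply the rule to every school, one at a time, as .apply does for a column.
colors = [pick_color(school) for school in schools]
print(colors)
```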

In [None]:
Marker.map_table(nc_col_uni_loc_sep, marker_icon = 'info-sign')