# JupyterLab: not just a pretty new front-end for Jupyter Notebooks

## File Browser

Though it is pretty pretty.

<- Look at that file explorer

Just double-click a file to open it, or right-click for more options.

And it's super fast to start a new Notebook: just click + in the top left of the File Browser.

Look at that snazzy new Launcher with R Notebooks out of the box also!

Let's celebrate by downloading a gif, moving it to the right folder, and embedding it into this notebook without having to leave this tab!

In [1]:
import urllib.request
urllib.request.urlretrieve("https://media1.tenor.com/images/8f285c20303d91dd99827f191c2f639c/tenor.gif", "data-cat.gif")

('data-cat.gif', <http.client.HTTPMessage at 0x7fbd0ecc2c88>)

Ah, we forgot to download it to the `../data/jupyter_lab_files` directory, let's move it... by dragging and dropping. Now display!

<img src='../data/jupyter_lab_files/data-cat.gif'>
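
If you'd rather script the move than drag and drop, something like the following should work. It's a minimal sketch using only the standard library; `move_into` is a made-up helper name, not part of JupyterLab.

```python
import shutil
from pathlib import Path


def move_into(src, dest_dir):
    """Move src into dest_dir, creating the directory if needed."""
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    # shutil.move returns the destination path of the moved file
    return Path(shutil.move(str(src), str(dest_dir / Path(src).name)))


# Usage (paths from the download cell above):
# move_into("data-cat.gif", "../data/jupyter_lab_files")
```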

### My Gif!

Hey, waitasec, I also meant to have the gif in this section... Turns out files aren't the only thing you can drag and drop: cells work too! You can also hold shift to select multiple cells.

## CSVs...fast

Let's download a 3.7 million row csv of all NYC restaurants and their health inspections. (Be aware: it takes up 135 MB of space.) If you don't want to, you can open the 75,000 row `backup.csv` in the `jupyter_lab_files` folder instead.

In [4]:
I_WANT_TO_DOWNLOAD = False

In [5]:
if I_WANT_TO_DOWNLOAD:
    urllib.request.urlretrieve("https://data.cityofnewyork.us/api/views/xx67-kt59/rows.csv", "../data/jupyter_lab_files/csv.csv")

In [6]:
import pandas as pd
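try:
    # Prefer the full download if it exists
    df = pd.read_csv("../data/jupyter_lab_files/csv.csv")
except FileNotFoundError:
    # Otherwise fall back to the 75,000 row sample
    df = pd.read_csv("../data/jupyter_lab_files/backup.csv")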
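
Another way to express the same fallback is to pick the file first and read it second. This is only a sketch; `first_existing` is a hypothetical helper, not something pandas provides.

```python
from pathlib import Path


def first_existing(*candidates):
    """Return the first candidate path that is a file, else raise."""
    for candidate in candidates:
        path = Path(candidate)
        if path.is_file():
            return path
    raise FileNotFoundError(f"none of these files exist: {candidates}")


# Usage with the paths from the cell above:
# csv_path = first_existing(
#     "../data/jupyter_lab_files/csv.csv",
#     "../data/jupyter_lab_files/backup.csv",
# )
# df = pd.read_csv(csv_path)
```

Choosing the path up front means a parsing error in the big file surfaces as an error instead of silently switching to the backup.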

In [7]:
df.head()

Unnamed: 0,CAMIS,DBA,BORO,BUILDING,STREET,ZIPCODE,PHONE,CUISINE DESCRIPTION,INSPECTION DATE,ACTION,VIOLATION CODE,VIOLATION DESCRIPTION,CRITICAL FLAG,SCORE,GRADE,GRADE DATE,RECORD DATE,INSPECTION TYPE
0,50034221,AA CHINESE RESTAURANT,BRONX,214,E BURNSIDE AVE,10457.0,7182992218,Chinese,10/31/2017,Violations were cited in the following area(s).,04N,Filth flies or food/refuse/sewage-associated (...,Critical,12.0,A,10/31/2017,06/05/2018,Cycle Inspection / Initial Inspection
1,50046117,ASIAN GARDEN L & Y,QUEENS,8417,JAMAICA AVE,11421.0,7188053636,Chinese,10/24/2016,Violations were cited in the following area(s).,10F,Non-food contact surface improperly constructe...,Not Critical,7.0,A,10/24/2016,06/05/2018,Cycle Inspection / Re-inspection
2,41278885,SAKURA JAPANESE RESTAURANT,BROOKLYN,3118,AVENUE U,11229.0,7186460666,Japanese,05/30/2017,Violations were cited in the following area(s).,06D,"Food contact surface not properly washed, rins...",Critical,14.0,,,06/05/2018,Cycle Inspection / Initial Inspection
3,50056263,DUZER'S LOCAL CAFE + MARKET,STATEN ISLAND,387,VAN DUZER ST,10304.0,9176404867,Café/Coffee/Tea,03/07/2017,Violations were cited in the following area(s).,06D,"Food contact surface not properly washed, rins...",Critical,20.0,,,06/05/2018,Cycle Inspection / Initial Inspection
4,41025191,MCDONALD'S,BRONX,1540,WESTCHESTER AVENUE,10472.0,7188422981,Hamburgers,04/14/2016,Violations were cited in the following area(s).,06C,Food not protected from potential source of co...,Critical,25.0,,,06/05/2018,Cycle Inspection / Initial Inspection


Ugh, that output is taking up too much space. Let's collapse it: click on the cell, then click the blue bar that appears to its left.

## Running python files interactively

Look inside `../data/jupyter_lab_files`

Looks like someone wrote a python file which reads in the above csv. It's not in a notebook, but we can still step through it interactively to see what the results are at each point. We can also edit parts of it and re-run just the lines we want.

Just open up the Python file by double-clicking. Highlight individual lines or blocks of code, then press Shift+Enter (or run them from the `Run` menu).

Add "roach" to the list of keywords. No need to re-run the whole file! Just select everything from that line on down and run that. Nice job saving yourself from reloading the dataframe from the csv!
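
To see why re-running only the bottom of a script pays off, here is a rough sketch of how such a file might be laid out. Everything in it is an assumption for illustration: the sample descriptions are made up, and the real script's keyword list and logic may differ.

```python
# Hypothetical stand-in for the script's slow step (reading the csv).
# The sample rows below are made up for illustration.
descriptions = [
    "Evidence of mice in the kitchen",
    "Live roaches present in food area",
    "Food not held at proper temperature",
]

# Hypothetical keyword list; the real script's list may differ.
keywords = ["mice", "rats"]
keywords.append("roach")  # the edit described above


def mentions_any(text, words):
    """Return True if any keyword appears in the text, ignoring case."""
    lowered = text.lower()
    return any(word in lowered for word in words)


# Only this part needs re-running after the keyword list changes.
flagged = [d for d in descriptions if mentions_any(d, keywords)]
print(flagged)
```

Because the loaded data stays in the kernel's memory, selecting from the keyword list downward skips the expensive load at the top.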

## Simultaneous tabs in one window!

No more scrolling through dozens of browser tabs to find the right Notebook. Now you can keep them all in one browser tab. Just open up a new Notebook from the Launcher.

Or even open a second view of the same Notebook to copy and paste between, by right-clicking the tab and choosing "New View for Notebook".

But what if you want to copy and paste from Notebook to Notebook, or refer to different parts of the same Notebook? It's still kinda annoying to flip between tabs... Well, now we can do side-by-side windows! Click and drag the tab and drop it on the right-hand side of the browser window.

You can even do live editing and previewing of a number of file types. Let's try opening the markdown file in `../data/jupyter_lab_files`. Open up one tab in an editor and the other in preview mode. Set them up side by side and edit the one in the editor. See it change on the right.

### Other editors

Let's go back to the Launcher. Just like with Jupyter Notebook, we can also open a Terminal and a Text Editor. All the more reason to stay in the Jupyter environment and save yourself the friction of context switching and finding your place again...

## Put it all together: New workflows

Let's open up a terminal window...

What's the streamshare of Tier 0 artists and how many are there? I remember the table is in the content-creator-data project, but which dataset... You can run `bq ls --project_id=content-creator-data` in a terminal, or prefix it with `!` to run it from a Notebook cell, as below.

In [35]:
!bq ls --project_id=content-creator-data

       datasetId        
 ---------------------- 
  album_streams         
  artist_features       
  artist_split_streams  
  artist_streams        
  artist_streams_rank   
  artist_users          
  catalog               
  dataset_stats         
  events                
  genre_mapping         
  hubs                  
  licensor_hierarchy    
  licensor_metadata     
  licensor_streamshare  
  metadata              
  nielsen               
  streams_events        
  track_streams         
  track_users           
  user_artist_streams   
  user_genre            
  user_track_streams    
  wiggle_room           
  wikipedia             


Let's open up a text editor with SQL syntax highlighting... We can develop the query right here in Jupyter without having to leave to go use a separate text editor or the BigQuery GUI. Stay inside the Jupyter environment and do less context switching.

In [11]:
from pandas.io import gbq
from functools import partial

In [12]:
project_I_have_access_to = 'content-creator-data'

In [13]:
bq = partial(gbq.read_gbq, project_id=project_I_have_access_to, verbose=False, dialect='standard')

(Note: `partial` is a nice way to supply common arguments once, then reuse the function with those arguments pre-set so you don't have to retype them every time.)
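
If `partial` is new to you, here is a tiny standalone sketch of the same idea. `run_query` and `bq_demo` are made-up names and nothing here touches BigQuery; the function only echoes its arguments so you can see what gets pre-set.

```python
from functools import partial


def run_query(query, project_id, dialect="legacy"):
    """Stand-in for a query function; it only echoes its arguments."""
    return {"query": query, "project_id": project_id, "dialect": dialect}


# Pre-set the arguments that never change.
bq_demo = partial(run_query, project_id="my-project", dialect="standard")

# Now only the query needs to be passed.
result = bq_demo("select 1")

# Keyword arguments given at call time still override the pre-set ones.
legacy = bq_demo("select 1", dialect="legacy")
```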

So now that the query is all set, you can paste it below.

In [36]:
tiering_q = """
select sum(artist_split_streamshare) total, count(*)
from `content-creator-data.artist_split_streams.main_artist_split_streamshare_90d_20180601`
where tier=0
"""

In [37]:
bq(tiering_q)

Unnamed: 0,total,f0_
0,0.28622,250


How about a more complex and gnarly query? Let's open up `../data/jupyter_lab_files/example_query.sql`

In [8]:
with open('../data/jupyter_lab_files/example_query.sql') as f:
    query = f.read()

In [14]:
bq(query)

Unnamed: 0,total_releases_in_2015,currently_available_2015_releases,have_US_availability,currently_available_in_US
0,1166908,835773,778026,759566


From your terminal, you can go ahead and push to GitHub if you want...

<img src='../data/jupyter_lab_files/environment.png' />

Stay in the Jupyter environment longer, be a more efficient data scientist!