
In [None]:
## imports
import pandas as pd
import numpy as np
import plotnine
from plotnine import *
import random

## print multiple things from same cell
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"


# Example code

## Load data

In [None]:
## load data on 2020 crimes in DC
dc_crim_2020 = pd.read_csv("https://opendata.arcgis.com/datasets/f516e0dd7b614b088ad781b0c4002331_2.csv")
dc_crim_2020.head()
dc_crim_2020.shape
dc_crim_2020.info()


## Example of creating a table to export to LaTeX



In [None]:
method_v_offense = pd.crosstab(dc_crim_2020.METHOD, 
                              dc_crim_2020.OFFENSE)
method_v_offense

## method 1: transpose and print the table to the console to copy/paste
## (index = True keeps the offense names, which become the row index after transposing)
print(method_v_offense.T.to_latex(index = True, caption = "Types of weapons in offenses",
                                 label = "tab:method_offense"))

## method 2: write a .tex file to the folder, then upload it to Overleaf and reference the file directly
method_v_offense.T.to_latex("methodoffense.tex", 
                            index = True, caption = "Types of weapons in offenses",
                            label = "tab:method_offense_written")


## for method 2, if working with latex locally, can also then
## just reference the filepath directly rather than uploading to overleaf
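
To make "reference the tex file directly" concrete, here is a minimal sketch that writes a small wrapper document pulling in the exported table with `\input`. The wrapper filename `main.tex` and the helper `build_wrapper` are hypothetical names chosen for illustration; the `booktabs` package is loaded because tables written by pandas use `\toprule`, `\midrule`, and `\bottomrule`.

```python
from pathlib import Path


def build_wrapper(table_file: str = "methodoffense.tex") -> str:
    """Return a minimal LaTeX document that pulls in the exported table.

    `build_wrapper` is a hypothetical helper, not part of pandas.
    """
    lines = [
        r"\documentclass{article}",
        r"\usepackage{booktabs}  % rules used by pandas-generated tables",
        r"\begin{document}",
        rf"\input{{{table_file}}}",
        r"Table~\ref{tab:method_offense_written} cross-tabulates offense by method.",
        r"\end{document}",
    ]
    return "\n".join(lines) + "\n"


wrapper = build_wrapper()
# "main.tex" is an assumed filename; keep it in the same folder as the table file
Path("main.tex").write_text(wrapper)
print(wrapper)
```

If the table lives in a subfolder, pass the relative path instead, for example `build_wrapper("tables/methodoffense.tex")`.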

## Example of creating a figure to export 


In [None]:
## create a fig with the count of crimes by shift 

count_byshift = pd.DataFrame(dc_crim_2020.groupby('SHIFT')['OCTO_RECORD_ID'].nunique()).reset_index()
count_byshift

plot_shifts = (ggplot(count_byshift, aes(x = 'SHIFT', 
                                       y = 'OCTO_RECORD_ID')) +
            geom_bar(stat = "identity", fill = "firebrick") +
            theme_classic() +
            xlab("Which shift?") +
            ylab("Count of crimes") +
            theme(axis_text = element_text(size = 14, color = "black")))
plot_shifts

## method 1 (would avoid): right-click the rendered plot and save the image

## method 2 - write image
plot_shifts.save("plot_shifts.png", 
                width = 12,
                height = 8,
                verbose = False)
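
Once the image is written, it still has to be included on the LaTeX side. The sketch below builds the `figure` environment as a string so the caption and label can be set from Python. The helper `figure_snippet`, the caption text, and the label `fig:plot_shifts` are assumptions for illustration; the LaTeX document also needs `\usepackage{graphicx}` in its preamble for `\includegraphics` to work.

```python
def figure_snippet(image_file: str, caption: str, label: str,
                   width: str = r"0.8\textwidth") -> str:
    """Return a LaTeX figure environment that includes `image_file`.

    `figure_snippet` is a hypothetical helper, not part of plotnine.
    """
    lines = [
        r"\begin{figure}[ht]",
        r"\centering",
        rf"\includegraphics[width={width}]{{{image_file}}}",
        rf"\caption{{{caption}}}",
        rf"\label{{{label}}}",
        r"\end{figure}",
    ]
    return "\n".join(lines)


snippet = figure_snippet("plot_shifts.png",
                         "Count of crimes by shift",
                         "fig:plot_shifts")
print(snippet)
```

The printed block can be pasted into the document, or written to a `.tex` file and pulled in with `\input`, mirroring method 2 for tables.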

# Activity 

1. You decide the table is only informative for crimes where there's variation in the type of method used. Filter and create a new table that only includes offenses where fewer than 80% of the methods are "other".

2. Create a table to export to LaTeX with that filtered information. Create a caption, ideally programmatically rather than manually, that specifies which offenses are excluded from the table. In LaTeX, write a few bullet points summarizing what the table shows. Have one bullet point define the fraction used for filtering in mathematical notation.

3. With that filtered set of offenses, create a figure where the x axis is the type of offense and the y axis is the proportion of that offense where a gun is used. Order the x axis from highest to lowest proportion. Export the figure for LaTeX.

4. *Challenge exercise*: an analyst on a different team wants a breakdown of how the workload varies by shift. They want a separate figure for each of the ANCs in Ward 8 (ANC starts with 8). Using a loop or function, write a separate bar plot for each ANC and make sure to programmatically change the plot filename so you know which is which.
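
As a starting point, the patterns the activity relies on can be sketched on toy data. This is a hedged illustration, not a solution for the DC file: the toy rows, the method label `"OTHERS"`, the ANC labels `"8A"` and `"8B"`, and the filename pattern are all assumptions, so check the actual values in `dc_crim_2020` before adapting it.

```python
import pandas as pd

# Toy data (hypothetical, NOT the DC file): one row per incident
toy = pd.DataFrame({
    "METHOD":  ["GUN", "OTHERS", "OTHERS", "OTHERS", "OTHERS",
                "OTHERS", "KNIFE", "GUN", "OTHERS", "OTHERS"],
    "OFFENSE": ["ROBBERY", "ROBBERY", "THEFT", "THEFT", "THEFT",
                "THEFT", "ROBBERY", "ROBBERY", "THEFT", "THEFT"],
})

# Share of each method within each offense (every column sums to 1)
shares = pd.crosstab(toy["METHOD"], toy["OFFENSE"], normalize="columns")

# Split offenses by whether the "OTHERS" share is below the 80% cutoff
other_share = shares.loc["OTHERS"]
keep = other_share[other_share < 0.8].index.tolist()
excluded = other_share[other_share >= 0.8].index.tolist()

# Count table restricted to the offenses that pass the filter
counts_filtered = pd.crosstab(toy["METHOD"], toy["OFFENSE"])[keep]

# Caption built from the data; "%" starts a comment in LaTeX, so it is escaped
caption = ("Types of weapons in offenses. Excluded because at least 80\\% "
           "of methods are other: " + ", ".join(excluded))

# Proportion of each kept offense involving a gun, sorted highest to lowest
gun_share = shares.loc["GUN", keep].sort_values(ascending=False)

# One filename per group, so saved plots are distinguishable (assumed labels)
filenames = {anc: f"plot_shifts_anc_{anc}.png" for anc in ["8A", "8B"]}

print(counts_filtered)
print(caption)
```

Because `caption` is derived from `excluded`, it stays in sync with the filter if the cutoff or the data changes.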