# [Global 88] Gender Violence in India

#### Professor: Karenjot Bhangoo Randhawa

**Estimated Time:** *55 minutes*  
**Notebook Created By:** Bella Chang, Emily Guo, Carlos Calderon  
**Code Maintenance:** Carlos Calderon

Welcome! Throughout this notebook you will use data science techniques and concepts to assess different measures of gender-related violence in India. Data analysis allows us to summarize and understand trends in our data. The main purpose of this notebook is to identify trends in gender-related violence in India after [the Criminal Law (Amendment) Act 2013](https://www.iitk.ac.in/wc/data/TheCriminalLaw.pdf). Specifically, we'd like to assess whether this legislation had any effect on gender-related violence across India and, if so, what those effects were.  

**Learning Outcomes**  

By the end of this notebook, students will be able to: 
1. Work with Jupyter Notebooks and conduct basic data exploration on gender violence using a local case study in India.
2. Apply basic data literacy, including recognizing intentional and unintentional misuse of statistics.
3. Recognize ethical challenges and make decisions about research efficacy and validity.

# Table of Contents

1. [Introduction to Jupyter](#Intro)  
    1.1 [The Code Cell](#code)  
    1.2 [Getting Started](#started)
2. [Introduction to Python](#python)  
    2.1 [Print Statements](#print)  
    2.2 [Variables](#vars)  
    2.3 [Tables](#tables)
3. [Introduction to the Data](#data)  
    3.1 [Background to Data & Data Importing](#background)  
    3.2 [Features of our Data](#feature)
4. [City Analysis](#city)  
    4.1 [Crime Rate Throughout Time](#crimerate)  
    4.2 [Population as a Predictor of Crime](#popcrime)  
    4.3 [Mapping Crime Rates](#citymap)
5. [State/Union Territory Analysis](#state)  
    5.1 [Crime Rate Throughout Time](#risingcrime)  
    5.2 [Crime Rate Throughout Time Across States/UTs](#statettime)   
    5.3 [Relationships](#relationships)   
    5.4 [Mapping Crime Rates](#statemap)  
6. [Data Science Resources](#resources)

<a id='Intro'></a>

<a id="started"></a>

### 1.2 Getting Started
In data science, we manipulate our data using Python *libraries*, which are large collections of ready-made commands written for the Python programming language. We have imported some below so that we can easily visualize and analyze our data later on.

In [2]:
# Don't change this cell; just run it. 
import numpy as np
from datascience import *
import pandas as pd

# These lines do some fancy plotting magic
import plotly.express as px 

from ipywidgets import interact
import ipywidgets as widgets
from IPython.display import HTML

import folium
import geopandas as gpd

These libraries differ: some are aimed at visualization, some are better suited to statistics, and some serve other purposes. The point is that we import all of these tools now so that we have them ready when we work with our data later on.

----

<a id="data"></a>

# 3. Introduction to the Data

<a id="background"></a>

### 3.1 Background to Data & Data Importing  
Our data comes from the [National Crime Records Bureau](https://ncrb.gov.in/en) which was set up in 1986 to "function as a repository of information on crime and criminals so as to assist the investigators in linking crime to the perpetrators." For the purpose of this analysis, we deal with two datasets: [one pertaining to crime in cities in India](https://ncrb.gov.in/sites/default/files/crime_in_india_table_additional_table_chapter_reports/Table%203B.1_3.pdf) and [another pertaining to crime across Indian states/union territories](https://ncrb.gov.in/sites/default/files/crime_in_india_table_additional_table_chapter_reports/Table%203A.1_0.pdf).  

Both of these datasets look at crimes in India between the years of 2017-2019 as well as at specific factors (population, total crime against women) that will be useful in trying to make inferences about gender-based violence in certain areas in India.  
**Note:** In the Indian numbering system, a *lakh* is equal to 100,000. Moreover, politically, a *union territory* is a small administrative unit like a state, but while states are self-governed, union territories are ruled directly by the central (union) government. [Feel free to read more here](https://www.indiatoday.in/education-today/gk-current-affairs/story/what-is-the-difference-between-a-state-and-an-union-territory-1577445-2019-08-05!)

Here is our **state & union territory based dataset**. Don't worry too much about the specifics of the code, but do observe the table and its headers to familiarize yourself with the format and the significance of each column.

In [3]:
# When we reference this table later, notice that we can just call the variable states!
url = "https://raw.githubusercontent.com/cxrlsc/data/master/global-fa21/State%20Data.csv"
states = Table.read_table(url)
states.show(3)

| State/UT | 2017 Crimes | 2018 Crimes | 2019 Crimes | Percentage State Share to Total (2019) | Mid-Year Projected Female Population (in lakhs) (2019) | Rate of Total Crime Against Women (2019) | 2018-2019 GDP Per Capita ($) | 2018-2019 Unemployment Rate (%) | 2021 Literacy Rate (%) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Andhra Pradesh | 17909 | 16438 | 17746 | 4.4 | 261.4 | 67.9 | 2480 | 7.3 | 67.02 |
| Arunachal Pradesh | 337 | 368 | 317 | 0.1 | 7.3 | 43.3 | 2253 | 11.1 | 65.38 |
| Assam | 23082 | 27687 | 30025 | 7.4 | 168.9 | 177.8 | 1365 | 10.7 | 72.19 |
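
To make the *lakh* unit concrete, here is a minimal sketch that converts the population figures from two of the rows above into absolute counts and recomputes the rate column. It assumes, which the source does not state explicitly, that the rate is the number of 2019 crimes per lakh of projected female population. The names `LAKH`, `lakhs_to_count`, and `rate_per_lakh` are illustrative and not part of any library.

```python
LAKH = 100_000  # one lakh in the Indian numbering system

def lakhs_to_count(value_in_lakhs):
    """Convert a figure expressed in lakhs to an absolute count."""
    return value_in_lakhs * LAKH

def rate_per_lakh(crimes, population_in_lakhs):
    """Crimes per lakh of population (assumed definition of the rate column)."""
    return crimes / population_in_lakhs

# (2019 crimes, female population in lakhs, published rate) copied from the rows above
rows = {
    "Andhra Pradesh": (17746, 261.4, 67.9),
    "Assam": (30025, 168.9, 177.8),
}

for name, (crimes, pop_lakhs, published_rate) in rows.items():
    population = lakhs_to_count(pop_lakhs)
    computed = rate_per_lakh(crimes, pop_lakhs)
    print(f"{name}: about {population:,.0f} women, "
          f"computed rate {computed:.1f}, published rate {published_rate}")
```

Because the published population figures are rounded, a rate recomputed this way can differ from the published one in the last digit.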


Here is our **city based dataset**.

In [4]:
# Again, we can reference this city table later using just the variable cities.
url = "https://raw.githubusercontent.com/cxrlsc/data/master/global-fa21/Table%203B.1_3.csv"
cities = Table.read_table(url)
cities.show(3)

| City | 2017 | 2018 | 2019 | Percentage City Share to Total (2019) | Actual Population (in Lakhs) (2011) | Rate of Total Crime Against Women (2019)+ |
| --- | --- | --- | --- | --- | --- | --- |
| Ahmedabad (Gujarat) | 1405 | 1416 | 1633 | 3.6 | 30.0 | 54.4 |
| Bengaluru (Karnataka) | 3565 | 3427 | 3486 | 7.7 | 40.6 | 85.9 |
| Chennai (Tamil Nadu) | 642 | 761 | 729 | 1.6 | 43.1 | 16.9 |


<div class="alert alert-info">
<b> What differences do you see between the two tables (apart from the fact that one is based on states/union territories and one is based on cities)?</b>
</div>

*Replace this text with your response!*

<div class="alert alert-info"> 
    <b> Though the column names are quite similar in this dataset, why might it be helpful to also have city data in our analysis of gender-based violence in particular areas? </b>
</div>

*Replace this text with your response!*

In [5]:
from otter.export import export_notebook
from IPython.display import display, HTML
export_notebook("india_gender_violence.ipynb", filtering=True, pagebreaks=False)
display(HTML("Save this notebook, then click <a href='india_gender_violence.pdf' download>here</a> to open the pdf."))

<a id="feature"></a>

### 3.2 Features of our Data

Using some of our basic Python knowledge and one of our given datasets (specifically, the **state & union territory dataset**), let's see what baseline facts we can gain.

In [None]:
# First, let's see if we can find the number of rows and columns from our table states.

print(f"The state dataset has {states.num_rows} rows and {states.num_columns} columns")
print(f"The cities dataset has {cities.num_rows} rows and {cities.num_columns} columns")

<div class="alert alert-info"> 
    <b> The output above shows us that the states dataset contains 36 rows and 10 columns whilst the cities dataset contains 19 rows and 7 columns. For each of the datasets, what does each row represent? What does each column represent? </b>
</div>

*Replace this text with your response!*

We can also see which state/union territory had the greatest and smallest rate of total crime against women in 2019. Run the cell below which outputs the row representing the state with the highest rate of total crime against women in 2019.

In [None]:
# Greatest rate of total crime against women in 2019

# Store the column "Rate of Total Crime Against Women (2019)" in a variable
total_crime = states.column("Rate of Total Crime Against Women (2019)")

# Using total_crime & .where to see where in the table's column the max of the column is; are.equal_to is a comparison
# predicate to find, within a column, where a value or data point is equal to a certain number.
states.where(total_crime, are.equal_to(max(total_crime)))

In [None]:
# Lowest rate of total crime against women in 2019

total_crime = states.column("Rate of Total Crime Against Women (2019)")

states.where(total_crime, are.equal_to(min(total_crime)))
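
The two cells above rely on `where` together with the `are.equal_to` predicate. As a rough plain-Python sketch of the same idea (not the `datascience` implementation), the snippet below keeps every row whose rate equals the column maximum or minimum. It uses only the three preview rows shown earlier, so its answers describe that preview and not the full 36-row table. `rows_where_equal` is a hypothetical helper.

```python
# Three preview rows from the states table:
# (State/UT, rate of total crime against women in 2019)
preview = [
    ("Andhra Pradesh", 67.9),
    ("Arunachal Pradesh", 43.3),
    ("Assam", 177.8),
]

def rows_where_equal(rows, target):
    """Return every row whose rate equals target (tied rows are all kept)."""
    return [row for row in rows if row[1] == target]

rates = [rate for _, rate in preview]
highest = rows_where_equal(preview, max(rates))
lowest = rows_where_equal(preview, min(rates))

print("Highest in preview:", highest)
print("Lowest in preview:", lowest)
```

Filtering by equality, rather than picking a single position, means that if two states tie for the maximum or minimum, both rows are returned.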

<div class="alert alert-info"> 
    <b> The two outputs above show us the states with the maximum and minimum rate of crime against women as measured in 2019. Take some time to do some research on each of these states. What differences/similarities in policies, society, or other factors could contribute to the large disparity in crime rates? Compare and contrast the different values in each of the columns.  
       </b>
</div>

*Replace this text with your response!*

We can also create sorted tables that show the rate of total crime against women for each dataset in order from greatest to smallest. Below, we first show the 4 states with the highest rate of total crime against women in 2019. The table after that shows the 4 cities with the highest rate of crime against women in 2019. 

In [None]:
# States/UT table with rate of total crime against women in descending order
descending_crime_states = states.sort("Rate of Total Crime Against Women (2019)", descending = True)
descending_crime_states.show(4)

In [None]:
# Cities table with rate of total crime against women in descending order
descending_crime_cities = cities.sort("Rate of Total Crime Against Women\n(2019)+", descending = True)
descending_crime_cities.show(4)

<div class="alert alert-info"> 
    <b> What differences and/or similarities do you see in the states vs. the cities with the highest rate of crimes against women? How do you interpret the differences? The similarities? </b> 
    </div>

*Replace this text with your response!*

----

<a id="city"></a>

# 4. City-Wide Analysis 

As we have seen, our cities dataset contains information on 19 metropolitan cities. Two natural questions are how crime compares across cities and how it has changed in recent years. The following section contains several visualizations of the data. Analyzing and interpreting data visualizations is an integral skill not only in data science but across all domains. For most of the following questions, we'd like you to take some time to examine each graph and answer some questions based on it. 

In [None]:
# Just run this cell
city_arr = ['Ahmedabad', 'Bengaluru', 'Chennai', 'Coimbatore', 'Delhi', 'Ghaziabad', 'Hyderabad',
            'Indore', 'Jaipur', 'Kanpur', 'Kochi', 'Kolkata', 'Kozhikode', 'Lucknow', 'Mumbai',
            'Nagpur', 'Patna', 'Pune', 'Surat']
cities['City'] = city_arr
cities_df = cities.to_df()
cities_melted = pd.melt(cities_df, id_vars=["City"], value_vars=["2017", "2018", "2019"])
cities_melted = cities_melted.rename(columns={"City":"City", "variable":"Year", "value":"Number of Crimes"})
cities.show(5)
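
The `pd.melt` call above reshapes the table from wide form (one column per year) to long form (one row per city and year), which is the layout `px.line` needs in order to color lines by year. A minimal plain-Python sketch of that reshaping, using the three preview rows shown earlier and a hypothetical helper `melt_rows`, might look like this:

```python
# Wide form: one record per city, one key per year (values from the preview rows)
wide = [
    {"City": "Ahmedabad", "2017": 1405, "2018": 1416, "2019": 1633},
    {"City": "Bengaluru", "2017": 3565, "2018": 3427, "2019": 3486},
    {"City": "Chennai", "2017": 642, "2018": 761, "2019": 729},
]

def melt_rows(records, id_var, value_vars, var_name, value_name):
    """Turn each (record, value column) pair into its own long-form record."""
    long_form = []
    for column in value_vars:      # stack one year column at a time
        for record in records:
            long_form.append({
                id_var: record[id_var],
                var_name: column,
                value_name: record[column],
            })
    return long_form

long_rows = melt_rows(wide, "City", ["2017", "2018", "2019"], "Year", "Number of Crimes")
for row in long_rows[:4]:
    print(row)
```

Three cities and three year columns give nine long-form rows, each carrying a single city, year, and count.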

<a id="crimerate"></a>

### 4.1 Crime Rate Throughout Time in Metropolitan Cities

A question that naturally arises with time-series data is how trends change over time. Our dataset contains information from three years: 2017, 2018, and 2019. Given the 2013 legislation, we might expect a decrease in the number of gender-related crimes. But is that really the case? Run the cell below to look at the number of crimes for each city across time. 

In [None]:
fig = px.line(cities_melted, x="City", y="Number of Crimes", color="Year", height=500, width=1500, 
              title="Number of Crimes Against Women in Cities Across India (in lakhs) from 2017-2019")
fig.show()
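
One way to put numbers on what the line chart shows is the percent change in reported crimes between two years. The sketch below does this for the three preview rows shown earlier. `percent_change` is a hypothetical helper, and three cities are not enough to say anything about the national trend.

```python
def percent_change(old, new):
    """Percent change from old to new; positive means an increase."""
    return (new - old) / old * 100

# (2017 crimes, 2019 crimes) copied from the preview rows of the cities table
crimes = {
    "Ahmedabad": (1405, 1633),
    "Bengaluru": (3565, 3486),
    "Chennai": (642, 729),
}

changes = {city: percent_change(c2017, c2019) for city, (c2017, c2019) in crimes.items()}
for city, change in changes.items():
    direction = "rose" if change > 0 else "fell"
    print(f"{city}: reported crimes {direction} by {abs(change):.1f}% from 2017 to 2019")
```

A percent change puts large and small cities on a comparable scale, which raw counts on a shared axis do not.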

<div class="alert alert-info"> 
    <b> Closely examine the graph above. Hover your mouse over each data point to get more information. Was there any type of social unrest related to gender-based violence in 2019 that would explain a spike in incidence in cities such as Jaipur? On the other hand, how was 2017 different from 2018/19 in Pune? What other discrepancies do you notice across cities in different years?</b>
</div>

*Replace this text with your response!*

<a id="popcrime"></a>

### 4.2 Population as a Predictor for Crime 

Another question that we can ask ourselves is, are more populated areas more likely to experience higher crime rates? 

In [None]:
fig = px.scatter(cities_df, y=cities['2019']/cities['Actual Population (in Lakhs) (2011)'],  x="Actual Population (in Lakhs) (2011)",
                 size="2019", hover_data=["City"], height=700, width=700, labels=dict(y="Number of Crimes (per lakh)", size="Number of crimes"),
                 title="Population vs. Number of Crimes Against Women (in Lakhs)")
fig.update_xaxes(title_text="Population")
fig.show()

<div class="alert alert-info"> 
    <b> The graph above plots the relationship between population and the number of crimes per lakh in a given city. The size of each data point reflects the number of crimes against women in that city (i.e.: larger circles represent more crimes against women). What does this graph tell us about highly concentrated urban centres vs. rural communities? Do you notice any trends in the data? That is, are more populated areas experiencing more crime?</b>
</div>

*Replace this text with your answer!*

The graph below is identical to the one above, except that it adds a [trend line](https://www.investopedia.com/terms/t/trendline.asp#:~:text=A%20trendline%20is%20a%20line,during%20periods%20of%20price%20contraction.), which allows us to generalize the overall trend in the data. 

In [None]:
fig = px.scatter(cities_df, y=cities['2019']/cities['Actual Population (in Lakhs) (2011)'],  x="Actual Population (in Lakhs) (2011)",
                 size="2019", hover_data=["City"], height=700, width=700, labels=dict(y="Number of Crimes (per lakh)", size="Number of crimes"),
                 title="Population vs. Number of Crimes Against Women (in Lakhs)", trendline="ols")
fig.update_xaxes(title_text="Population")
fig.show()
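
The `trendline="ols"` option asks Plotly Express to fit an ordinary least squares line (Plotly relies on the `statsmodels` package for this, so it needs to be installed). As a rough illustration of what such a fit computes, the sketch below fits a line by hand to the three preview cities shown earlier. The helper `fit_line` is hypothetical, and a fit through three points only demonstrates the arithmetic; it is not evidence about the 19-city relationship.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the ordinary least squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = covariance(x, y) / variance(x)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Preview rows: 2011 population (in lakhs) and 2019 crime rate per lakh
population = [30.0, 40.6, 43.1]   # Ahmedabad, Bengaluru, Chennai
crime_rate = [54.4, 85.9, 16.9]

slope, intercept = fit_line(population, crime_rate)
residuals = [y - (slope * x + intercept) for x, y in zip(population, crime_rate)]
print(f"fitted line: rate = {slope:.2f} * population + {intercept:.2f}")
print("residuals:", [round(r, 2) for r in residuals])
```

The residuals show how far each city sits from the fitted line; large residuals relative to the slope suggest the line summarizes the data poorly.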

<div class="alert alert-info"> 
    <b> How were our original assumptions challenged based on the first and second graphs? For example, the second graph shows that crime rates are consistent across (i.e.: do not depend on) population.</b>
</div>

*Replace this text with your answer!*

<a id="citymap"></a>

### 4.3 Mapping Crime Rates Throughout Cities

Another useful visualization is a [choropleth map](https://en.wikipedia.org/wiki/Choropleth_map), which allows us to visualize a variable across a geographic area. Below, we map cities and their respective crime rates. This helps us visualize the distribution of crime across different geographical areas in India. 

In [None]:
geoJSON_url = "https://raw.githubusercontent.com/cxrlsc/data/master/global-fa21/india_city.geojson"
geoJSON_df = gpd.read_file(geoJSON_url)
cities_locs_url = "https://raw.githubusercontent.com/cxrlsc/data/master/global-fa21/Cities%20Location.csv"
cities_locations = pd.read_csv(cities_locs_url)
joined_df = pd.merge(geoJSON_df, cities_df, left_on="NAME_2", right_on="City")

# folium expects [latitude, longitude]; these are approximate coordinates for central India
india_latitude = 21.7969
india_longitude = 78.8718

india_map = folium.Map(location=[india_latitude, india_longitude], zoom_start = 4)

folium.Choropleth(
    geo_data=joined_df,
    data=joined_df,
    columns=['City', 'Rate of Total Crime Against Women\n(2019)+'],
    key_on="feature.properties.City",
    fill_color='YlGnBu',
    fill_opacity=1,
    line_opacity=0.2,
    legend_name="total crime (2019)",
    smooth_factor=0,
    line_color = "#0000",
    nan_fill_color = "White"
).add_to(india_map)

for i in np.arange(len(cities_locations)):
    folium.Marker(location=[cities_locations['Latitude (N)'][i], cities_locations['Longitude (E)'][i]], popup=city_arr[i]).add_to(india_map)

    
sw = cities_locations[["Latitude (N)", "Longitude (E)"]].min().tolist()
ne = cities_locations[["Latitude (N)", "Longitude (E)"]].max().tolist()

india_map.fit_bounds([sw, ne])    
india_map
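
`folium` expects coordinates as `[latitude, longitude]` pairs, and `fit_bounds` takes the south-west and north-east corners of the area to show. The cell above gets those corners by taking column-wise minima and maxima. Here is a minimal plain-Python sketch of that step. The coordinates are rounded, approximate values for three cities chosen for illustration and are not read from the notebook's location file, and `bounding_box` is a hypothetical helper.

```python
# Approximate (latitude, longitude) pairs, for illustration only
locations = {
    "Delhi": (28.61, 77.21),
    "Mumbai": (19.08, 72.88),
    "Chennai": (13.08, 80.27),
}

def bounding_box(points):
    """Return (south_west, north_east) corners as [lat, lon] lists."""
    points = list(points)
    lats = [lat for lat, _ in points]
    lons = [lon for _, lon in points]
    south_west = [min(lats), min(lons)]   # smallest latitude and longitude
    north_east = [max(lats), max(lons)]   # largest latitude and longitude
    return south_west, north_east

sw, ne = bounding_box(locations.values())
print("south-west corner:", sw)
print("north-east corner:", ne)
```

Note that the corners need not coincide with any single city: here the south-west corner combines Chennai's latitude with Mumbai's longitude.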

<div class="alert alert-info"> 
    <b> The choropleth map above shows different Indian cities and their respective rates of crime against women. Feel free to zoom in and out as much as you want. Are there any regional trends in the data? (i.e.: Are northern cities experiencing more crime than southern cities?) Choose two cities from the map above (e.g.: Delhi, Ghaziabad), conduct some individual research, and list any reasons that could explain differences in crime rates between the two. </b>
</div>

*Replace this text with your answer*

----

<a id="state"></a>

# 5. State-Wide Analysis

Shifting gears from city-level analysis to state-level analysis requires careful thought. Trends seen in individual cities may disappear when looking at a state as a whole. Our states dataset contains crime counts and other information about each state. This additional information includes features that could give insight into the relationship between a state's characteristics and the number of incidents there. For the "2018-2019 GDP Per Capita ($)" column, no data is available for D&N Haveli, Daman & Diu, and Lakshadweep, so we used 2003-2004 GDP per capita for these three union territories instead.

In [None]:
states.show(5)

<a id="risingcrime"></a>

### 5.1 Crime Rate Throughout Time

In [None]:
total_crime_2017 = np.sum(states.column("2017 Crimes"))
total_crime_2018 = np.sum(states.column("2018 Crimes"))
total_crime_2019 = np.sum(states.column("2019 Crimes"))
total_crime_tbl = Table().with_columns("Year", make_array(2017, 2018, 2019), 
                                       "Total Crimes", make_array(total_crime_2017, 
                                                                  total_crime_2018, total_crime_2019))

In [None]:
fig = px.line(total_crime_tbl.to_df(), x="Year", y="Total Crimes", height=550, width=550,
              title="Number of Crimes Against Women in India from 2017-19")
fig.update_xaxes(nticks=3)
fig.show()

<div class="alert alert-info"> 
    <b> The graph above shows us that, across all states, gender-based crime seems to be rising from 2017 to 2019. When the new legislation was passed in 2013, reporting gender-based violence became mandatory. Do you believe the numbers are rising because of increased reporting or because of increased crime?</b>
</div>

*Replace this text with your answer!*

<a id="statettime"></a>

### 5.2 Crime Rate Throughout Time Across States/UTs

In [None]:
states_df = states.to_df()
states_df = states_df.rename(columns={"2017 Crimes":"2017", "2018 Crimes":"2018", "2019 Crimes":"2019"})
states_melted = pd.melt(states_df, id_vars=["State/UT"], value_vars=["2017", "2018", "2019"])
states_melted = states_melted.rename(columns={"variable":"Year", "value":"Number of Crimes"})
fig = px.line(states_melted, x="State/UT", y="Number of Crimes", color="Year", height=500, width=1500, 
              title="Number of Crimes Against Women in States Across India (in lakhs) from 2017-2019")
fig.update_xaxes(tickangle=-75)
fig.show()

<div class="alert alert-info"> 
    <b>What do we know about states such as Madhya Pradesh, Bihar, Assam, Maharashtra, and Rajasthan that could explain differences in crime rates across time?</b>
</div>

*Replace this text with your answer!*

<a id="relationships"></a>

### 5.3 Relationships Between State Features and Crime Rate

#### Correlation

When working as data scientists, one of the most important measures to understand is [correlation](https://en.wikipedia.org/wiki/Correlation). This concept is used to describe the strength of an association between two variables. More specifically, it refers to how *linearly associated* two variables are. For the purpose of this lesson, you don't need to focus too much on this.  

An important value to understand is the correlation coefficient **r**, which is always a number between -1 and 1. Values of **r** close to 0 tell us that two variables have little or no linear association. In contrast, values close to -1 or 1 tell us that two variables are *highly correlated*, meaning they are strongly linearly associated with one another. Notice that this does not tell us whether two variables are related by cause and effect. This is because **correlation is not causation**.

For the following section, you will be able to select between two columns (aka: variables) from the `states` dataset and view their correlation coefficient and their scatterplots. Scatterplots are the ideal way to visualize a relationship. [Using the following link as a guide, interpret the correlation coefficients and the scatterplot between two variables for the following questions.](https://www.scribbr.com/statistics/correlation-coefficient/)
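
To see where the coefficient comes from, here is a minimal sketch that computes **r** by hand for small made-up lists (the numbers are hypothetical and chosen only to show the extreme cases). `pearson_r` is an illustrative helper, not a library function.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How the two variables move together around their means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # How much each variable varies on its own
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

x = [1, 2, 3, 4, 5]
print(pearson_r(x, [2, 4, 6, 8, 10]))   # points on an increasing line
print(pearson_r(x, [10, 8, 6, 4, 2]))   # points on a decreasing line
print(pearson_r(x, [3, 1, 4, 1, 3]))    # no linear pattern
```

The first two lists lie exactly on a line, so **r** sits at the ends of its range; the third has no linear pattern, so **r** is essentially zero even though the values are clearly not random noise.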

In [None]:
def scatter_states(col1, col2):
    fig = px.scatter(states_df, x=col1, y=col2, hover_data=["State/UT"],
                     height=600, width=600, title=col1 + " vs. " + col2)
    
    print(f"Correlation coefficient: {np.corrcoef(states_df[col1], states_df[col2])[0, 1]}")

    fig.show()

lst = states_df.columns.tolist()
    
col1_widget = widgets.Dropdown(options=lst,
                                 value=lst[1],
                                 description="Column 1: ",
                                 disabled=False)

col2_widget = widgets.Dropdown(options=lst,
                                 value=lst[2],
                                 description="Column 2: ",
                                 disabled=False)

interact(scatter_states, col1=col1_widget, col2=col2_widget);

<div class="alert alert-info"> 
    <b>Using the two selection boxes above, choose two columns to visualize their relationship and read their correlation coefficient. What two columns have the highest correlations? What two columns have the lowest correlations? Why do you think this might be?</b>
</div>

*Replace this text with your answer!*

<div class="alert alert-info"> 
    <b>Choose the first column to be "2021 Literacy Rate (%)". With what other column does literacy rate have the highest correlation? The lowest? Based on the scatter plot and correlation coefficient, how do literacy rate and crime in 2019 relate to one another? </b>
</div>

*Replace this text with your answer!*

<div class="alert alert-info"> 
    <b>Choose the first column to be "2018-2019 Unemployment Rate (%)". With what other column does unemployment have the highest correlation? The lowest? Based on the scatter plot and correlation coefficient, how do unemployment and crime in 2019 relate to one another?</b>
</div>

*Replace this text with your answer!*

<a id="statemap"></a>

### 5.4 Mapping Crime Rates throughout States/UTs

In [None]:
updated_state_url = "https://raw.githubusercontent.com/cxrlsc/data/master/global-fa21/Updated%20State%20Data.csv"
updated_states = pd.read_csv(updated_state_url)
state_geo_url = "https://raw.githubusercontent.com/cxrlsc/data/master/global-fa21/india_state_geo.json"
geoJSON_df = gpd.read_file(state_geo_url)
joined_df = pd.merge(geoJSON_df, updated_states, left_on="NAME_1", right_on="State/UT")

states_locs_url = "https://raw.githubusercontent.com/cxrlsc/data/master/global-fa21/State%20Locations.csv"
states_locations = pd.read_csv(states_locs_url)

# folium expects [latitude, longitude]; these are approximate coordinates for central India
india_latitude = 21.7969
india_longitude = 78.8718

india_map = folium.Map(location=[india_latitude, india_longitude], zoom_start = 4)

folium.Choropleth(
    geo_data=joined_df,
    data=joined_df,
    columns=['State/UT', "Rate of Total Crime Against Women (2019)"],
    key_on="feature.properties.NAME_1",
    fill_color='YlGnBu',
    fill_opacity=0.8,
    line_opacity=0.2,
    legend_name="total crime (2019)",
    smooth_factor=0,
    line_color = "#0000",
    nan_fill_color = "White"
).add_to(india_map)

states_arr = ['Andaman and Nicobar', 'Andhra Pradesh', 'Arunachal Pradesh', 'Assam', 'Bihar',
                             'Chandigarh', 'Chhattisgarh', 'Daman and Diu', 'Delhi', 'Goa', 'Gujarat', 'Haryana',
                             'Himachal Pradesh', 'Jammu and Kashmir', 'Jharkhand', 'Karnataka', 'Kerala', 'Lakshadweep',
                             'Madhya Pradesh', 'Maharashtra', 'Manipur', 'Meghalaya', 'Mizoram', 'Nagaland', 'Orissa',
                             'Puducherry', 'Punjab', 'Rajasthan', 'Sikkim', 'Tamil Nadu', 'Tripura', 'Uttar Pradesh',
                             'Uttaranchal', 'West Bengal']

for i in np.arange(len(states_locations)):
    folium.Marker(location=[states_locations['Latitude (N)'][i], states_locations['Longitude (E)'][i]], popup=states_arr[i]).add_to(india_map)
    
sw = states_locations[["Latitude (N)", "Longitude (E)"]].min().tolist()
ne = states_locations[["Latitude (N)", "Longitude (E)"]].max().tolist()

india_map.fit_bounds([sw, ne])
india_map
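
`pd.merge` performs an inner join by default, so any state whose name is spelled differently in the two files silently drops out of `joined_df` and is not drawn on the map. A quick check is to compare the two sets of names before merging. The sketch below uses short, hypothetical name lists (the older spellings 'Orissa' and 'Uttaranchal' appear in the marker list above, which is why they are used here), and `unmatched_names` is an illustrative helper.

```python
def unmatched_names(left_names, right_names):
    """Return names found in only one of the two collections, sorted for readability."""
    left, right = set(left_names), set(right_names)
    return sorted(left - right), sorted(right - left)

# Hypothetical examples: shape file with older spellings, data table with newer ones
geo_names = ["Assam", "Bihar", "Orissa", "Uttaranchal"]
data_names = ["Assam", "Bihar", "Odisha", "Uttarakhand"]

only_in_geo, only_in_data = unmatched_names(geo_names, data_names)
print("Only in the shape file:", only_in_geo)
print("Only in the data table:", only_in_data)
```

If both lists come back empty, every name matched and no state was lost in the join.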

<div class="alert alert-info"> 
    <b> The choropleth map above shows Indian states and their rates of crime against women. Looking at the legend above, what state has the highest rate of crime against women in 2019? The lowest? What similarities and/or differences do you see in the regional distribution of crime? That is, are there certain areas with higher rates of crime than others? What possible factors may contribute to this? </b>
</div>

*Replace this text with your answer!*

### Conclusion

Congratulations! You've reached the end of the assignment. For some of you, this may be your first introduction to programming and data science methods. We hope you gained some insight into how to analyze and interpret crime rate data, and particularly how to think about these data in their sociopolitical context. 

### Submitting Your Work 

In [None]:
# This may take a few seconds 
from IPython.display import display, HTML
!pip install -U notebook-as-pdf -q
!jupyter-nbconvert --to PDFviaHTML india_gender_violence.ipynb
display(HTML("Save this notebook, then click <a href='india_gender_violence.pdf' download>here</a> to open the pdf."))

<a id="resources"></a>

## 6. Data Science Resources at UC Berkeley 

If you need any assistance with this notebook, our peer advisors are here to help! Their drop-in hours are [here](https://data.berkeley.edu/ds-peer-consulting). You can also [email them](mailto:ds-peer-consulting@berkeley.edu) to book an appointment if there are any time conflicts.  

If you are interested in data science as a whole, a great course to start with is [Data 8](http://data8.org/fa21/), which is designed for students with no programming or stats experience. For the full list of courses and degree programs, [click here](https://data.berkeley.edu/academics/data-science-undergraduate-studies/data-science-academic-enrichment).

#### Feedback:

Please let us know your thoughts on this notebook! [Fill out the following survey here.](https://docs.google.com/forms/d/e/1FAIpQLScjlDMT_ddo-yZCTZsm2ZVlK6rrfv5D5KM1fD-B2wp2CS4xgw/viewform) 