# Accessing ACLED Data Programmatically by Date Range and/or Region

ACLED API documentation: https://acleddata.com/acleddatanew/wp-content/uploads/dlm_uploads/2019/01/API-User-Guide2020.pdf


### This notebook pulls ACLED data programmatically without running into the API's default data cap or taxing the API with repeated requests.

### The API has a standard limit of 500 data points per query, but the limit parameter can be raised to cover all the data. Before querying any events, this notebook reads the total number of data points in ACLED from the region endpoint of the API. If the notebook user is interested in ACLED globally, the total count across all regions is used as the limit parameter, so only one pull is needed.

### If the notebook user is interested in a specific region, the notebook uses the same region endpoint to get the total data point count for that region. The limit parameter then becomes the total count for the specified region.

### The notebook then loads the results into a pandas DataFrame and gives the user the option to save them as a CSV file.


# Info for Variables
- start_date and end_date
    - The date range you want to search on. Format: YYYY-MM-DD
- specific_reg (case sensitive)
    - Yes - if you are looking at a specific region
    - No - if you want all regions
- reg
    - If you set specific_reg to Yes, this is the region you want to pull. Enter the numeric code associated with the region from the list below.
        - 1 - Western Africa
        - 2 - Middle Africa
        - 3 - Eastern Africa
        - 4 - Southern Africa
        - 5 - Northern Africa
        - 7 - Southern Asia
        - 9 - South-Eastern Asia
        - 11 - Middle East
        - 12 - Europe
        - 13 - Caucasus and Central Asia
        - 14 - Central America
        - 15 - South America
        - 16 - Caribbean
- csv_create
    - Yes - if you want to save the data to a CSV file
    - No - if you don't want to save the data but want to keep working
- csv_file
    - The entire file name, making sure it ends with .csv (ex. Tiffany_test.csv)

* Note - Make sure that all variable values have quotes around them (example: '12')
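
A mistyped date or region code tends to surface only later, as a failed lookup or an empty result, so it can help to check the variables before running the remaining cells. The following is a minimal sketch and not part of the original notebook: `validate_inputs` and `REGION_CODES` are hypothetical names, and the codes are copied from the list above.

```python
from datetime import datetime

# Region codes copied from the list above (hypothetical helper table).
REGION_CODES = {
    '1': 'Western Africa', '2': 'Middle Africa', '3': 'Eastern Africa',
    '4': 'Southern Africa', '5': 'Northern Africa', '7': 'Southern Asia',
    '9': 'South-Eastern Asia', '11': 'Middle East', '12': 'Europe',
    '13': 'Caucasus and Central Asia', '14': 'Central America',
    '15': 'South America', '16': 'Caribbean',
}

def validate_inputs(start_date, end_date, specific_reg, reg):
    """Return the parsed dates, or raise ValueError on a malformed variable."""
    # strptime raises ValueError if the string is not a year-month-day date.
    start = datetime.strptime(start_date, '%Y-%m-%d').date()
    end = datetime.strptime(end_date, '%Y-%m-%d').date()
    if start > end:
        raise ValueError('start_date must not be after end_date')
    if specific_reg not in ('Yes', 'No'):
        raise ValueError("specific_reg must be 'Yes' or 'No' (case sensitive)")
    if specific_reg == 'Yes' and reg not in REGION_CODES:
        raise ValueError('unknown region code: ' + repr(reg))
    return start, end

start, end = validate_inputs('2020-03-01', '2020-03-14', 'Yes', '12')
print(start, end, REGION_CODES['12'])
```

Calling the function right after the variables cell would stop the notebook early with a readable message instead of a `KeyError` further down.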

# Populate Variables

In [1]:
##Variables
start_date = '2020-03-01'
end_date = '2020-03-14'
specific_reg = 'Yes'
reg = '12'
csv_create = 'Yes'
csv_file = 'Tiffany_test3.csv'

In [2]:
##import libraries and packages
import requests
import pandas as pd
from pandas import json_normalize  # top-level import (pandas 1.0+); the pandas.io.json path is deprecated

In [3]:
##access regional counts
URL = "https://api.acleddata.com/region/read?terms=accept"
r = requests.get(url = URL)
data = r.json()
df_reg = json_normalize(data, 'data')

In [4]:
##get total count of events worldwide
df_reg['count'] = df_reg['event_count'].astype(int)
total = df_reg['count'].sum()
##create dictionary mapping each region's numeric code to its event count
##both are stored as strings so that reg = '12' matches and the count can go straight into the query
dictionary_lim = pd.Series(df_reg['event_count'].astype(str).values, index = df_reg['region'].astype(str)).to_dict()


In [5]:
##A global query is limited to the total global count
##A regional query is limited to that region's total count
if specific_reg == 'No':
    lim = str(total)
    print(lim)
elif specific_reg == 'Yes':
    lim = str(dictionary_lim[reg])
    print(lim)

47591
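
The limit selection in the two cells above can be tried without calling the API. This is a sketch under stated assumptions: `pick_limit` is a hypothetical helper, the rows only mimic the `region` and `event_count` fields the notebook reads from the region endpoint, and the counts are made-up illustration values, not real ACLED figures.

```python
def pick_limit(region_rows, specific_reg, reg=None):
    """Return the limit parameter as a string, using per-region event counts."""
    # Normalise keys to str and counts to int, since JSON may deliver either type.
    counts = {str(row['region']): int(row['event_count']) for row in region_rows}
    if specific_reg == 'Yes':
        return str(counts[str(reg)])   # regional query: that region's count
    return str(sum(counts.values()))   # global query: sum over all regions

# Made-up rows for illustration only.
sample_rows = [
    {'region': 1, 'event_count': '120'},
    {'region': 12, 'event_count': '45'},
]

print(pick_limit(sample_rows, 'Yes', '12'))  # 45
print(pick_limit(sample_rows, 'No'))         # 165
```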


In [6]:
##Build the query parameters - should not need to change
URL = 'https://api.acleddata.com/acled/read?terms=accept'
##Passing a dict lets requests add the "&" separators and URL-encode the values
PARAMS = {'limit': lim, 'event_date': start_date + '|' + end_date, 'event_date_where': 'BETWEEN'}
if specific_reg == 'Yes':
    PARAMS['region'] = reg
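
To inspect the exact request before sending it, the query string can be assembled with the standard library. This is an illustrative sketch, not the notebook's own code: `build_url` is a hypothetical helper, the parameter names are the ones used in the cell above, and it assumes the API accepts the `|` date separator in its percent-encoded form (`%7C`).

```python
from urllib.parse import urlencode

BASE_URL = 'https://api.acleddata.com/acled/read'

def build_url(start_date, end_date, limit, reg=None):
    """Assemble the event query URL from the notebook variables."""
    params = {
        'terms': 'accept',
        'limit': limit,
        'event_date': start_date + '|' + end_date,  # date range, pipe-separated
        'event_date_where': 'BETWEEN',
    }
    if reg is not None:
        params['region'] = reg  # only added for a regional query
    return BASE_URL + '?' + urlencode(params)

url = build_url('2020-03-01', '2020-03-14', '47591', reg='12')
print(url)
# https://api.acleddata.com/acled/read?terms=accept&limit=47591&event_date=2020-03-01%7C2020-03-14&event_date_where=BETWEEN&region=12
```

Printing the URL makes it easy to paste the same query into a browser when a pull returns something unexpected.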

In [7]:
##Request event data
r = requests.get(url = URL, params = PARAMS)
r.raise_for_status()  # stop here if the API returned an HTTP error status
data_event = r.json()
df_event = json_normalize(data_event, 'data')
df_event
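
If the API answers with an error message instead of events, `json_normalize(data_event, 'data')` fails with a less obvious error, so a small check on the decoded JSON can make problems easier to read. The sketch below is hedged: `extract_records` is a hypothetical helper, only the `data` field is confirmed by the notebook, and the `success` flag is an assumption about the response format that should be checked against the API user guide.

```python
def extract_records(payload):
    """Return the list of event records from a decoded API response."""
    if not isinstance(payload, dict) or 'data' not in payload:
        raise ValueError('response has no "data" field')
    # Assumption: a "success" flag may be present; treat an explicit False as failure.
    if payload.get('success') is False:
        raise ValueError('API reported failure: ' + repr(payload.get('error')))
    return payload['data']

# Made-up payloads for illustration only.
good = {'success': True, 'data': [{'event_date': '2020-03-02', 'region': '12'}]}
records = extract_records(good)
print(len(records))  # 1
```

In the notebook this would sit between `r.json()` and `json_normalize`, for example `df_event = json_normalize(extract_records(data_event))`.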

In [9]:
##option to save data off as a csv or keep working
if csv_create == 'Yes':
    df_event.to_csv(csv_file)
elif csv_create == 'No':
    print('Keep on analyzing your pandas dataframe you crazy data person!')