Datasets wiki page
This page lists, for each relevant dataset, the name, location, size, encoding, availability of an API, timescale, etc. It also links to scripts for reading the datasets.
Note that some datasets (including those requiring the NDA) are given on a USB stick.
Both VAM and mVAM collect data.
Info about acronyms used is in the list of acronyms. That page also contains more info about rCSI, FCS, etc.
##mVAM data
For mVAM we have aggregated data (data per province per month) and some raw survey data.
The aggregated data can be found at http://vam.wfp.org/sites/mvam_monitoring/ (click on Databank). An Excel sheet with the same data, plus added visualisations, is at https://github.com/datamission/WFP/tree/master/Datasets/WFP-mVAM-CSI-FCS-Prices.
Because mVAM is more recent, this contains data from 2015 onwards.
Note that for the survey datasets you need to sign an NDA. You will get it on a USB stick.
We have survey data available for Sierra Leone and for Yemen.
For Yemen there is:
- CFSS data (from a Comprehensive Food Security Survey in 2014), see USB/Yemen/CFSS/. Results are in http://documents.wfp.org/stellent/groups/public/documents/ena/wfp269771.pdf.
- mVAM data from 2015 onwards, see USB/Yemen/mVAM/. See https://github.com/datamission/WFP/tree/master/Datasets/WFP-mVAM-Survey/Yemen for a script to read the Raw mVAM data for Yemen. USB/Yemen/mVAM/YEM_WFP_mVAM_RawData_Dictionnary.xlsx contains info about all columns in this dataset.
For Sierra Leone there is:
- CFSVA (a comprehensive food security and vulnerability analysis) questionnaire from 2015. See USB/Sierra Leone/CFSVA/.
- mVAM data from 2015 onwards, see USB/Sierra Leone/mVAM/. See https://github.com/datamission/WFP/tree/master/Datasets/WFP-mVAM-Survey/Sierra-Leone for a script to read the Raw mVAM data for Sierra Leone. USB/Yemen/mVAM/YEM_WFP_mVAM_RawData_Dictionnary.xlsx contains info about some of the columns in this dataset. Other columns are explained in USB/Sierra Leone/mVAM/Data/mVAM_Ebola_datadictionary.xlsx.
##VAM data:
The VAM website contains info per country from 2009 onwards. Luckily the info is aggregated to three big files, viz.:
###Coping Strategy Index
Contains info on the CSI per country per month, sometimes per province. Note that there is usually data for only one month per country. See here for info about CSI.
- location: https://data.hdx.rwlabs.org/dataset/coping-strategy-index-csi
- size: less than 1k lines. Coping Strategy Index per month per province.
- type: CSV
- encoding: ANSI
- timescale: from 2007 until 2013 (depending on country)
- Location of scripts: https://github.com/datamission/WFP/tree/master/Datasets/WFP-Coping-Strategy-Index
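The "ANSI" encoding above can trip up a default UTF-8 read. Below is a minimal sketch, assuming "ANSI" here means Windows-1252 (`cp1252`), which is the usual meaning but is not confirmed for this file. The column names and sample values are made up for illustration; check the real header before relying on them.

```python
import csv
import io

def read_ansi_csv(raw_bytes, encoding="cp1252"):
    """Decode ANSI bytes (assumed to be Windows-1252) and parse them as CSV rows."""
    text = raw_bytes.decode(encoding)
    # newline="" leaves line endings untouched so the csv module can handle them.
    reader = csv.DictReader(io.StringIO(text, newline=""))
    return list(reader)

# Made-up sample; the real column names and values may differ.
sample = "Country,Month,CSI\r\nCôte d'Ivoire,2012-05,14.2\r\n".encode("cp1252")
rows = read_ansi_csv(sample)
```

For a file on disk, the equivalent is `open(path, encoding="cp1252", newline="")` passed to `csv.DictReader`. The same approach should apply to the Food Consumption Score file, which is also listed as ANSI.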
###Food Consumption Score
Contains info on the FCS per country per month, sometimes per province. Note that there is usually data for only one month per country. See here for info about FCS.
- location: https://data.hdx.rwlabs.org/dataset/food-consumption-score-fcs
- size: less than 1k lines. Food Consumption Score per month per province.
- type: CSV
- encoding: ANSI
- timescale: from 2009 until 2015 (depending on country)
- Columns requiring further explanation: None
- Location of scripts: https://github.com/datamission/WFP/tree/master/Datasets/WFP-Food-Consumption-Score
The third file, on income activities, is not a very important dataset. It contains info on who works in which sector in a country. See https://github.com/datamission/WFP/tree/master/Datasets/WFP-Income-Activies for a bit more info.
##Food prices
WFP gathers a lot of data about prices in different markets:
- location: http://vam.wfp.org/sites/data/WFPVAM_FoodPrices.csv
- size: over 600k lines. Food price per month per market per category
- type: CSV
- encoding: UTF-8, no BOM
- timescale: from 2005 until 2016 (depending on market, and category)
- Extra info: note that sometimes you have different sources of prices for a single country
- Columns requiring further explanation: None
- Location of scripts: https://github.com/datamission/WFP/tree/master/Datasets/WFP-Food-Prices
Note that this data can already be visualised at http://foodprices.vam.wfp.org/Default.aspx. It is used to issue alerts when certain food prices spike. This is done with ALPS (Alert for Price Spikes), see this technical note for more info.
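Since the price file has over 600k lines and a country can have several price sources, it may help to stream the file once and list the sources per country before comparing prices. The sketch below does that with the standard library only. The column names `country` and `source` and all sample values are placeholders, not the real header of `WFPVAM_FoodPrices.csv`.

```python
import csv
import io
from collections import defaultdict

def sources_per_country(lines, country_col="country", source_col="source"):
    """Stream CSV rows and collect the set of price sources seen per country."""
    sources = defaultdict(set)
    for row in csv.DictReader(lines):
        sources[row[country_col]].add(row[source_col])
    return {country: sorted(found) for country, found in sources.items()}

# Made-up sample rows; column names and values are placeholders.
sample = io.StringIO(
    "country,market,commodity,month,year,price,source\n"
    "Yemen,Market A,Wheat,1,2015,120,Source X\n"
    "Yemen,Market A,Wheat,2,2015,125,Source Y\n"
    "Sierra Leone,Market B,Rice,1,2015,3000,Source X\n"
)
result = sources_per_country(sample)
# Countries whose prices come from more than one source.
multi_source = [country for country, found in result.items() if len(found) > 1]
```

For the real file, pass an open file handle instead of the sample, e.g. `open(path, encoding="utf-8", newline="")`, and set `country_col` and `source_col` to the actual column names.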
#Other datasets
Other datasets might be useful, e.g.:
Climate Hazards Group InfraRed Precipitation with Station data (CHIRPS) is a 30+ year quasi-global rainfall dataset. The data is collected from satellite imagery and weather stations around the world. See http://chg.geog.ucsb.edu/data/chirps/index.html and http://chg-wiki.geog.ucsb.edu/wiki/CHIRPS_FAQ for more information. See https://github.com/datamission/WFP/tree/master/Datasets/CHIRPS for Python scripts to read it.
GDELT gathers events from around the world, divided into categories such as 'protest', 'appeal for humanitarian aid', 'engage in mass expulsion', etc. (full list at http://gdeltproject.org/data/lookups/CAMEO.eventcodes.txt).
- location: since obtaining the data requires registering with Google Cloud first and querying in SQL, we gathered monthly data for Sierra Leone in long form at https://github.com/datamission/WFP/tree/master/Datasets/GDELT/raw_data and as pivot tables at https://github.com/datamission/WFP/tree/master/Datasets/GDELT/clean_data
- size: long-form tables: 5217 rows (each row contains month, event code, number of events); pivot tables: 221 rows (each row contains event code, event description, and event counts for each month from 2010 to 2016)
- type: CSV
- encoding: UTF-8, no BOM
- timescale: from 2010 until now
- Columns requiring further explanation: None
- Location of scripts: https://github.com/datamission/WFP/tree/master/Datasets/GDELT
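To illustrate how the long-form and pivot layouts relate, here is a minimal sketch that turns (month, event code, count) rows into one row per event code with a count for every month. It is not the script from the repository. The month format and the sample rows are assumptions made for the example.

```python
from collections import defaultdict

def pivot_event_counts(long_rows):
    """Turn (month, event_code, count) rows into {event_code: {month: count}}."""
    months = sorted({month for month, _, _ in long_rows})
    # Every event code gets an entry for every month, defaulting to 0 events.
    table = defaultdict(lambda: dict.fromkeys(months, 0))
    for month, code, count in long_rows:
        table[code][month] += count
    return months, dict(table)

# Made-up sample; event codes are kept as strings so leading zeros survive.
long_rows = [
    ("2010-01", "14", 3),
    ("2010-02", "14", 5),
    ("2010-02", "023", 1),
]
months, table = pivot_event_counts(long_rows)
```

Keeping the codes as strings matters because CAMEO codes such as `023` lose their leading zero when parsed as integers.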
FAO gathers a large number of datasets on food, agriculture, and trade (see http://faostat3.fao.org/download/D/FS/E).
We suggest focusing on the Food security database and the Livestock database.
- location: CSV can be downloaded from http://faostat3.fao.org/download/D/FS/E and http://faostat3.fao.org/download/Q/QL/E
- size: ~1 MB. Each row contains country, item, year, and value. A description of the columns is at https://github.com/datamission/WFP/tree/master/Datasets/FAO/
- type: CSV
- encoding: UTF-8, no BOM
- timescale: depending on dataset, from 1960 to 2015
- Columns requiring further explanation: None
- Location of scripts: https://github.com/datamission/WFP/blob/master/Datasets/FAO/FAO.ipynb
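Because each FAO row holds a country, item, year, and value, a common first step is to pull out a time series for one country and one item. The sketch below shows one way to do that with the standard library. The lowercase column names, the item label, and the values are illustrative assumptions; the real header is described in the repository linked above.

```python
import csv
import io

def time_series(lines, country, item):
    """Return {year: value} for one country and item, ordered by year."""
    series = {}
    for row in csv.DictReader(lines):
        if row["country"] == country and row["item"] == item:
            series[int(row["year"])] = float(row["value"])
    return dict(sorted(series.items()))

# Made-up sample; column names, item label, and values are placeholders.
sample = io.StringIO(
    "country,item,year,value\n"
    "Yemen,Example indicator,2011,27.5\n"
    "Yemen,Example indicator,2010,26.0\n"
    "Sierra Leone,Example indicator,2010,30.1\n"
)
yemen = time_series(sample, "Yemen", "Example indicator")
```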
Contours of administrative boundaries are available here:
Yemen: http://geonode.wfp.org/layers/geonode%3Ayem_bnd_adm2
Sierra Leone: http://geonode.wfp.org/layers/?limit=10&offset=0&title__icontains=Sierra%20Leone,%20Administrative%20Boundaries,%20December%202014 (at the moment this link from WFP is not working; it can be replaced with https://www.arcgis.com/home/item.html?id=9c333de1a58041319daecdaf16f7392f but I need to check if they are the same subdivisions)
A very easy tutorial for plotting shapefiles in Python is http://basemaptutorial.readthedocs.io/en/latest/shapefile.html (you'll need the basemap toolkit, imported as mpl_toolkits.basemap)
Likewise, in R: http://www.r-bloggers.com/shapefiles-in-r/
IATI is a standard in which development agencies report their progress. Lots of data is available, ask an organizer for more info!
See http://www.data4food.org/databases.html for links to more open datasets.