In [1]:
# Setting up a custom stylesheet in IJulia
styl = read("style.css", String) # Read a .css file in the same folder as this notebook file
HTML(styl) # Output as HTML

<h1> Ebola and Wikipedia: Loading publicly available data using Julia </h1>

<h2>In this lecture</h2>

- [Outcome](#Outcome)
- [Wikipedia data on the West African EVD epidemic](#Wikipedia-data-on-the-West-African-EVD-epidemic)
- [Using readdlm() to load a .csv file](#Using-readdlm-to-load-a-.csv-file)

[Back to the top](#In-this-lecture)

<h2>Outcome</h2>

After this lecture, you will be able to:
- Find data on the West African EVD epidemic online
- Use readdlm() to load data from a .csv file containing this data

[Back to the top](#In-this-lecture)

<h2>Wikipedia data on the West African EVD epidemic</h2>

Wikipedia has many excellent articles on Ebola. We will be using one with fairly complete data on the timeline of cases: https://en.wikipedia.org/wiki/West_African_Ebola_virus_epidemic_timeline_of_reported_cases_and_deaths

Go there now, please, and navigate until you see a table that looks like this:

<img src="Week2_Lecture2_1-Wikipedia-EVD-cases.png" alt="(Screenshot of wikipedia table of WA EVD cases)">


[Back to the top](#In-this-lecture)

We have provided the data as a file named wikipediaEVDraw.csv. The ".csv" extension indicates that it is a comma-separated values file, and the "raw" in the filename indicates that the data were imported as is, without any changes.
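To make "comma-separated" concrete, here is a minimal sketch. The line of text is made up for illustration and is not copied from wikipediaEVDraw.csv; it only shows how commas mark the boundaries between data items.

```julia
# A hypothetical line in the style of a .csv file (not taken from the real data)
line = "1 Jan 2015,100,50"

# Splitting on the comma character recovers the individual data items as strings
items = split(line, ',')

println(length(items))
println(items)
```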

If you would like to learn how to create .csv files from tables on the web, please go to the optional lecture "How to export web tables to .csv files".

[Back to the top](#In-this-lecture)

<h2>Using readdlm to load a .csv file</h2>

Now we can start using Julia again. In a new notebook for your Week 2 Julia code, enter and execute the line below:

In [3]:
using DelimitedFiles  # needed on Julia 0.7 and later, where readdlm lives in this standard library
wikiEVDraw = readdlm("wikipediaEVDraw.csv", ',')  # getting quotes right is important!

54x9 Array{Any,2}:
 "25 Nov 2015"  28637  11314  3804  2536  …  4808     14122     3955   
 "18 Nov 2015"  28634  11314  3804  2536     4808     14122     3955   
 "11 Nov 2015"  28635  11314  3805  2536     4808     14122     3955   
 "4 Nov 2015"   28607  11314  3810  2536     4808     14089     3955   
 "25 Oct 2015"  28539  11298  3806  2535     4808     14061     3955   
 "18 Oct 2015"  28476  11298  3803  2535  …  4808     14001     3955   
 "11 Oct 2015"  28454  11297  3800  2534     4808     13982     3955   
 "27 Sep 2015"  28388  11296  3805  2533     4808     13911     3955   
 "20 Sep 2015"  28295  11295  3800  2532     4808     13823     3955   
 "13 Sep 2015"  28220  11291  3792  2530     4808     13756     3953   
 "6 Sep 2015"   28147  11291  3792  2530  …  4808     13683     3953   
 "30 Aug 2015"  28073  11290  3792  2529     4808     13609     3953   
 "16 Aug 2015"  27952  11284  3786  2524     4808     13494     3952   
 ⋮                                        ⋱  

The readdlm() function is Julia's way to read any file that consists of lines separated into data items by a delimiter of some sort. In fact, the very word "readdlm" is an abbreviation of "read-with-a-delimiter".

Notice three things:
- We have used a variable to contain the data from the file (you could change the name, though, if you like)
- The file name is given as a string, using double quotes
- The delimiter is given as a character, using single quotes
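The three points above can be tried out without the course data. The sketch below writes a tiny made-up table to a temporary file and reads it back with readdlm(). The table contents, the variable names `path` and `demo`, and the choice of ';' as the delimiter are all assumptions made for illustration.

```julia
using DelimitedFiles  # home of readdlm on Julia 0.7 and later

# Write a small, made-up table to a temporary file, with ';' between data items
path = tempname()
write(path, "1 Jan 2015;100;50\n2 Jan 2015;120;60\n")

# File name as a string (double quotes), delimiter as a character (single quotes)
demo = readdlm(path, ';')

println(size(demo))    # number of rows and columns
println(eltype(demo))  # element type of the resulting array
```

Swapping ';' for ',' in both the file contents and the readdlm() call gives the comma-separated case used in this lecture.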

Finally, we see that the data, once stored in the variable, form an array whose elements are of type Any. This is not good for computation. In particular, for modelling we need the dates expressed as days since the start of the epidemic. Our next job is to convert the strings in column one into integers that give the number of days since 22 March 2014.
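As a preview of that conversion, here is a minimal sketch using Julia's Dates standard library. It is one possible approach and not necessarily the one the course develops; the names `epidemic_start`, `wiki_format` and `days_since_start` are hypothetical, and the format string assumes every date is written like "25 Nov 2015".

```julia
using Dates  # standard library for working with dates

# Assumed start date of the epidemic, as stated in the lecture text
epidemic_start = Date(2014, 3, 22)

# Assumed date layout: day, abbreviated English month name, year
wiki_format = DateFormat("d u y")

# Hypothetical helper: parse a date string and count whole days since the start
days_since_start(s) = Dates.value(Date(s, wiki_format) - epidemic_start)

println(days_since_start("25 Nov 2015"))
```

With the data loaded as above, an expression such as `[days_since_start(s) for s in wikiEVDraw[:, 1]]` would apply the helper to every entry of column one, provided each entry is a string in the assumed layout.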

[Back to the top](#In-this-lecture)