(C) Copyright IBM Corp. 2016

## Data Set
In this tutorial you will get acquainted with interactive notebooks by exploring and analyzing historical annual precipitation data.

The raw precipitation data set is from [UNdata](http://data.un.org/), an Internet search engine for statistical databases provided by the United Nations Statistics Division. We will make use of a curated version of this data set in this tutorial. The measurements are in million cubic meter.

1. Download the annual precipitation data in CSV format by clicking on this [link](https://cdsax.cloudant.com/public-samples/test/precipitation.csv).                 
    **Note:** If you use Safari, then right click on the link and choose "Download Linked File". The CSV file will be downloaded to your "Download" folder.              
1. Save the CSV file to your computer.



## Upload a File by Drag and Drop
You can easily add the downloaded CSV file by dragging it from your computer and dropping it onto the drop off area of the **Data Source** panel on the **Palette**. 
Note that the file appears in the **Data Source** panel on the **Palette** and the file is saved in Object Storage.
During the upload of the file, a progress bar indicates the status of the process. Below, you will learn how to access files in Object Storage.
    
<img src="https://cdsax.cloudant.com/public-samples/precipitation_analysis/precipitation_data_source.png" width="20%" height="20%"> 
After adding a data source to your notebook, you can insert the credentials into your code for further access via the **Insert to code** function. If you want to delete files or containers, then use the ** Manage files** function. It takes you to the Object Storage dashboard where you have access to more options for organize your files.

Three important dependancies used in this notebook are:
1. ** Spark-csv ** is a library that is massively used throughout the data science community to automatically infer the schema of your data.
2. ** Scalaj-http **  is a fully featured http client for Scala 
3. **Spark-kernel-brunel** is a visualization framework which you are going to be using to visualize
your data

### Step 1: Import dependancies, jars and packages into our notebook.
Click on the code cell below, then click the right arrow button (**&#9658;**) in the notebook **Toolbar** to run the code.

In [8]:
/* Update: As of [07.12.2016], the spark-kernel-brunel jar, spark-csv and scalaj-http dependencies 
are loaded by default in the Spark Bluemix Service */
    //Import Dependancies
    //%AddDeps com.databricks spark-csv_2.10 1.4.0 --transitive
    //%AddDeps org.scalaj scalaj-http_2.10 1.1.4 --transitive
    //Import Jar 
    //%AddJar -magic http://www.brunelvis.org/jar/spark-kernel-brunel-all-1.2.jar -f

//Import Packages
import scala.collection.breakOut
import org.apache.spark.sql.SQLContext
import org.apache.spark.rdd.PairRDDFunctions
import org.apache.spark.sql.functions._
import org.apache.spark.sql._

You can print your spark kernel configuration to ensure version compliance

In [2]:
println("Spark Context Version: " + sc.version);
println("Java version: " + scala.util.Properties.javaVersion);
println("Scala Release version: " + scala.util.Properties.releaseVersion);
println("_____________________");
println("Spark Context Config:");
sc.getConf.getAll.foreach(println);
println("")

Spark Context Version: 1.6.0
Java version: 1.8.0
Scala Release version: Some(2.10.5)
_____________________
Spark Context Config:
(spark.eventLog.enabled,true)
(spark.deploy.resourceScheduler.factory,org.apache.spark.deploy.master.EGOResourceSchedulerFactory)
(spark.ui.retainedJobs,0)
(spark.r.command,/usr/local/src/bluemix_jupyter_bundle.v8/R/bin/Rscript)
(spark.shuffle.service.enabled,true)
(spark.executor.extraJavaOptions,-Djava.security.egd=file:/dev/./urandom)
(spark.executor.id,driver)
(spark.port.maxRetries,512)
(spark.sql.tungsten.enabled,false)
(spark.repl.class.uri,http://10.143.133.52:40817)
(spark.logConf,true)
(spark.app.id,app-20160706132123-0270-037d6b55-d1db-4077-aa15-9e4c1612d83f)
(spark.externalBlockStore.folderName,spark-ec312195-b75a-46d5-93e6-a8cdbfc28ab5)
(spark.app.name,IBM Spark Kernel)
(spark.ui.enabled,false)
(spark.task.maxFailures,10)
(spark.master,spark://yp-spark-dal09-env5-0028:7082)
(spark.driver.port,58684)
(spark.sql.unsafe.enabled,false)
(spark.ui.reta

### Step 2: Access Object Storage

Because the CSV file is located in Object Storage, we need a helper function to access the file using
provided credentials. Run the cell below to define the method `setConfig()`.

In [3]:
/*found at http://stackoverflow.com/questions/33725500/load-data-from-bluemix-object-store-in-spark*/

def setConfig(name:String, dsConfiguration:String) : Unit = {
    val pfx = "fs.swift.service." + name
    val settings:Map[String,String] = dsConfiguration.split("\\n").
        map(l=>(l.split(":",2)(0).trim(), l.split(":",2)(1).trim()))(breakOut)

    val conf = sc.getConf
    conf.set(pfx + "auth.url", settings.getOrElse("auth_url",""))
    conf.set(pfx + "tenant", settings.getOrElse("tenantId", ""))
    conf.set(pfx + "username", settings.getOrElse("username", ""))
    conf.set(pfx + "password", settings.getOrElse("password", ""))
    conf.set(pfx + "apikey", settings.getOrElse("password", ""))
    conf.set(pfx + "auth.endpoint.prefix", "endpoints")
}

### Step 3: Insert data source credentials

Click the cell below and select **Insert to code** of the precipitation.csv file in the **Data Source**
Panel of the **Palette**. The neccesary credentials for accessing precipitation.csv will then be 
pasted into the code cell as a Scala dictionary. With the credentials, you can use the helper function
to load it into a dataframe. **Note**: Adjust the variable name of the inserted Scala 
dictionary according to the code below and run the code cell.

### Step 4: Load data into a `dataframe`

In [5]:
setConfig("spark", credentials_3.toString())
println("done loading credentials")
println("")

/*found at https://github.com/databricks/spark-csv*/
val sqlctx = new SQLContext(sc);
val scplain = sqlctx.sparkContext;
sqlctx.setConf("spark.sql.shuffle.partitions", "10");
import sqlctx.implicits._

/* Parenthesis allows line continuation which helps with readability */
val df = (sqlctx.read
    .format("com.databricks.spark.csv")
    .option("header","true")
    .option("inferschema","true")
    .option("mode","DROPMALFORMED")
    .load("swift://notebooks.spark/precipitation.csv")
);

println("created df")
println("")

done loading credentials
created df


## Explore Data
Now that we have the data in memory, we can explore and manipulate it.

Print the first 5 rows of the data using the `show()` method.  Run the code cell below.

In [6]:
df.show(5)

Each row provides:

* The Country or Area of the measurements
* The annual precipitation for 1990, and for 1995 to 2009

Using the `DataFrame` API, we can list all countries or areas, just run the cell below.

In [7]:
df.select("Country or Area").rdd.collect()

Array([Albania], [Algeria], [Andorra], [Anguilla], [Antigua and Barbuda], [Armenia], [Azerbaijan], [Bahrain], [Barbados], [Belarus], [Belgium], [Belize], [Benin], [Bermuda], [Bosnia and Herzegovina], [Botswana], [British Virgin Islands], [Brunei Darussalam], [Cameroon], [Central African Republic], [Chile], [China], [China, Hong Kong SAR], [China, Macao SAR], [Colombia], [Cote d'Ivoire], [Croatia], [Cuba], [Cyprus], [Czech Republic], [Denmark], [Dominican Republic], [Ecuador], [Egypt], [Estonia], [Finland], [France], [Gambia], [Georgia], [Germany], [Guinea], [Hungary], [India], [Iraq], [Israel], [Italy], [Jamaica], [Jordan], [Kazakhstan], [Kuwait], [Kyrgyzstan], [Latvia], [Lebanon], [Lithuania], [Luxembourg], [Madagascar], [Maldives], [Malta], [Ma...

### Plot
When working with interactive notebooks, you can decide how to present results and information. So far, we have used normal print functions which are informative. We can also choose a visual way, using Brunel. Brunel takes data in a specific way. It requires a dataframe as an input and could take column values for x and y axes. You can preprocess the data to create a dataframe that holds the value to be visualized on the x and y axes.

The x axis will show the years so you can create an array holding the years.

In [8]:
var years = df.columns.drop(1).map(r=>r.toString)

In [9]:
years

Array(1990, 1995, 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009)

The y axis will show the precipitation values for Germany so you can create an array to hold these.

In [10]:
var germanyDF = df.where($"Country or Area"==="Germany").drop("Country or Area")
var germany = germanyDF.collect()(0).toSeq.toArray.map(_.asInstanceOf[Double])

In [11]:
germany

We have two arrays that contain the year values and the precipitation amounts in Germany through these years. We index both arrays so we can later join them on these indexes. 

In [12]:
val germanyIndexed = germany.zipWithIndex.map(_.swap)
val yearsIndexed = years.zipWithIndex.map(_.swap)

The PairRDDFunctions class provide extra functions that can be performed on RDDs such as joins.
Therefore we create a PairRDDFunction object. 

In [13]:
val precipInGermany = new PairRDDFunctions(sc.parallelize(germanyIndexed))
var yearsRDD = sc.parallelize(yearsIndexed)

We perform the joining on the indexes that we previously created and create a dataframe out of the new precipitation-year pair. 

In [14]:
var precipByYearInGermany = precipInGermany.join(yearsRDD)
var precipByYearInGermanyDF = precipByYearInGermany.map(row=>row._2).toDF("Precipitation","Year")

In [15]:
precipByYearInGermanyDF.show(5)

Brunel is invoked using the magic word **`%%brunel`**. Let us have a look at the annual precipitation of Germany.

In [16]:
%%brunel
data("precipByYearInGermanyDF") bar x(Year) y(Precipitation)
title("Precipitation in Germany") axes(x:'Year',y:'Precipitation (million cubic meters)') interaction(none)

## Raise Questions
We explored and prepared the annual precipitation data. Now, we can raise more complex questions and make use of the data to answer them.
## Question 1: Which countries or areas have the highest total precipitation?
The first step we must take to answer this question is to compute the sums for annual precipitation. This can be done by specifying the columns to be summed and using the `+` sign.

In [17]:

var totalPrecipByCountryDF=df.withColumn("Sum",df("1990") 
                                                                + df("1995") + df("1996")
                                                                + df("1997") + df("1998")
                                                                + df("1999") + df("2000")
                                                                + df("2001") + df("2002")
                                                                + df("2003") + df("2004") 
                                                                + df("2005") + df("2006") 
                                                                + df("2007") + df("2008")
                                                                + df("2009")).orderBy(desc("Sum"))

Now we have an **additional** column called "Sum" in `totalPrecipByCountryDF`, containing the sums of the corresponding annual precipitation values. Because we are interested in the countries or areas with the highest precipitation, we sorted the `DataFrame` by total precipitation. 

In [18]:
var precipitationByCountry = totalPrecipByCountryDF.select(totalPrecipByCountryDF("Country or Area"),
                                              totalPrecipByCountryDF("Sum"))

We can print the top 5 countries or areas with the highest total precipitation.

In [19]:
precipitationByCountry.show(5)

Our next goal is to plot the top 5 countries in a bar chart together for better comparison.

In [20]:
var top5TotalPrecipByCountry = totalPrecipByCountryDF.take(5)

In [21]:
top5TotalPrecipByCountry

In [22]:
var top5TotalPrecipByCountryRDD= sc.parallelize(totalPrecipByCountryDF.take(5))

In [23]:
var top5TotalPrecipByCountryTransposed = top5TotalPrecipByCountryRDD.collect().toSeq.map(
    r=>r.toSeq).transpose

In [24]:
top5TotalPrecipByCountryTransposed.take(2)

In [25]:
var top5PrecipitationCountries = sc.parallelize(top5TotalPrecipByCountryTransposed.map(r=>Row(r: _*)).drop(1))

In [26]:
var top5PrecipByCountryIndexed = top5PrecipitationCountries.zipWithIndex.map(_.swap)
var yearsIndexedWithRow = sc.parallelize(years).map(r => Row(r)).zipWithIndex.map(_.swap)
var pairRdd  = new PairRDDFunctions(top5PrecipByCountryIndexed)

Now we can join the top5PrecipByCountryIndexed and the yearsIndexedWithRow. Both objects
have the same structure (index,Row(Field))

In [27]:
var top5PrecipByCountryWithYear = pairRdd.join(yearsIndexedWithRow)

The result is a PairRddFunction object of Tuples with key-value objects where the value is an Array of two Rows.  

In [28]:
println(top5PrecipByCountryWithYear.take(2)(1))

(14,([6200000.0,2545813.0,1293661.5,0.0,0.0],[2008]))


You can apply a map function to combine the year and the precipitation values rows and remove the indexing. 

In [29]:
var top5PrecipByCountryWithYearArray = top5PrecipByCountryWithYear.map(r=>r._2).collect().map(
                                                    p=> p._2.toSeq ++ p._1.toSeq).map(r=>Row(r:_*))

In [30]:
top5PrecipByCountryWithYearArray.take(1)

Array([1998,0.0,2501223.0,1520000.0,0.0,951060.0])

Now you can create a new dataframe holding the years and precipitation values for the top5 countries. The schema is defined below.

In [31]:
import org.apache.spark.sql._
import org.apache.spark.sql.types._

val schema2 = StructType(
    Array(
    StructField("Year",StringType,true),
    StructField("China",DoubleType,true),
    StructField("Colombia",DoubleType,true),
    StructField("Venezuela",DoubleType,true),
    StructField("India",DoubleType,true),
    StructField("Chile",DoubleType,true)
))

In [32]:
var precipByCountryWithYearDF = sqlctx.createDataFrame(sc.parallelize(top5PrecipByCountryWithYearArray),schema2)

In [33]:
precipByCountryWithYearDF.show()

+----+---------+---------+-----------+---------+-----------+
|Year|    China| Colombia|  Venezuela|    India|      Chile|
+----+---------+---------+-----------+---------+-----------+
|1998|      0.0|2501223.0|  1520000.0|      0.0|   951060.0|
|2008|6200000.0|2545813.0|  1293661.5|      0.0|        0.0|
|1990|      0.0|2190672.0|  1630000.0|      0.0|  1152225.0|
|2000|6009200.0|2529872.0|  1360000.0|      0.0|  1160289.0|
|2002|6261000.0|2044818.0|  1420000.0|      0.0|  1306221.0|
|2006|5784000.0|2509840.0|  1060000.0|4000000.0|  1181614.0|
|2004|5687600.0|2154068.0|  1520000.0|4000000.0|1272096.625|
|1996|      0.0|2599285.0|  1420000.0|      0.0|   897224.0|
|2007|5776300.0|2492041.0|1335846.125|4000000.0|800225.1875|
|2009|5596600.0|      0.0|   936000.0|      0.0|        0.0|
|2005|6101000.0|2467871.0|  1560000.0|4000000.0| 1231319.25|
|1995|      0.0|2399888.0|  1520000.0|      0.0|   971392.0|
|1997|      0.0|1945543.0|  1470000.0|      0.0|  1315783.0|
|2001|5812200.0|2059180.

Brunel supports multi-layering so you can take advantage of it in the following way. It also supports interactivity. Hover over the points on the graph to get the precipitation value for the corresponding year. 

In [34]:
%%brunel
data("precipByCountryWithYearDF") title("Top 5 Countries with highest precipitation") 
x(Year) y(Colombia,China,Venezuela,India,Chile) 
color(#series) axes(x:'Years',y:'Precipitation (million cubic meters)')
tooltip('Value for ',#series, ' is ',#values) 
+ line x(Year) y(Colombia,China,Venezuela,India,Chile) color(#series) 

//Resolution to https://github.com/Brunel-Visualization/Brunel/issues/104 should address the problem
//with the y-axis starting below 0

Now compare the annual precipitation values of the top 5 countries. China has the highest annual precipitation, followed by Colombia. It is also obvious that some values are missing, which makes comparison difficult. Now, we want to know the percentage values of the precipitation for the top 5 countries in relation to the rest. A pie chart is ideal to show this kind of information.

In [35]:
var top5 = precipitationByCountry.take(5)

In [36]:
var other = precipitationByCountry.collect().drop(5)

In [37]:
other

In [38]:
var totalPrecpitationForOther:Double =  0.0

for(x <- other)
{
    totalPrecpitationForOther += x(1).asInstanceOf[Double]
}

In [39]:
totalPrecpitationForOther

In [40]:
var top5WithOther = top5 :+ Row("Other", totalPrecpitationForOther)

In [41]:
import org.apache.spark.sql._
import org.apache.spark.sql.types._

val schema3 = StructType(
    Array(
    StructField("CountryOrArea",StringType,true),
    StructField("Precipitation", DoubleType,true)
))

var top5WithOtherDF = sqlctx.createDataFrame(sc.parallelize(top5WithOther),schema3)

In [42]:
top5WithOtherDF.show()

In [43]:
%%brunel
stack polar data("top5WithOtherDF") y(Precipitation) title("Annual precipitation percentage") 
color(CountryOrArea) label(Precipitation) percent(Precipitation)

In this pie chart one can recognize that from the 91 countries and areas we consider, nearly a quarter of the precipitation for the observed years was in China, and over half of the precipitation was in the top 5 countries.

# Question 2: Which countries show a negative trend in annual precipitation ?
Going through each row of a DataFrame and looking at the numbers is not a viable solution to determine trends. Plotting a bar chart for each country or area is possible, but inconvenient. One possiblity to determine trends in annual precipitation is to fit a line to the data points. Run the code cells below to see the bar chart and the trend for annual precipitation for Chile.


In [44]:
var chile = df.filter(df("Country or Area") === "Chile").drop("Country or Area")
var chileValues = chile.collect()(0).toSeq.toArray.map(_.asInstanceOf[Double])

//var chilePrecipitationValues = chileValues.map(_.toDouble)
val chileIndexed = chileValues.zipWithIndex.map(_.swap)
val chileIndexedPairRDD = new PairRDDFunctions(sc.parallelize(chileIndexed))
var precipByYearChile = chileIndexedPairRDD.join(sc.parallelize(yearsIndexed))
var precipByYearChileDF = precipByYearChile.map(row=>row._2).toDF("Precipitation","Year")

In [45]:
precipByYearChileDF.take(2)

In [46]:
%%brunel
data("precipByYearChileDF") bar x(Year) y(Precipitation) title("Precipitation in Chile") 
axes(x:'Years',y:'Precipitation (million cubic meters)') interaction(none)

In [47]:
precipByYearChileDF.show()

Exploring the data above, one can spot that the precipitation values for 2008 and 2009 are zero. You can filter 
these out by utilizing the filter function. 

In [48]:
var yearsChile = years.filter(x=> x.toInt<2008)
chileValues = chileValues.filter(r=>r>0.0)

In order to fit a line through the data points, we need to compute the slope and the y-intercept. Therefore, 
we define three new functions. 

In [49]:
/* This method computes the slope of a fitted line for given data points. */
def computeSlope( a:Array[Double], b:Array[Double]) : Double = {
     
     val xmean = a.sum / a.length
     val ymean = b.sum / b.length
     
     val xdelta = a.map(r=> r - xmean)
     val ydelta = b.map(w=> w - ymean)
     
     val productXY = (xdelta,ydelta).zipped.map((x,y) => x*y)
     val productXX = xdelta.map(x => x*x)
     
     return productXY.sum/productXX.sum
}

/* This method computes the y-intercept */
def computeYIntercept(a:Array[Double], b:Array[Double],slope:Double) : Double = {
    
    val xmean = a.sum / a.length
    val ymean = b.sum / b.length
     
    return ymean - (slope*xmean)
}

def bestFit(a:Array[Double], slope: Double, yIntercept:Double):Array[Double]  = {
  
  return a.map(w=>  (w * slope) + yIntercept)
}

To determine whether the trend is positive or negative, we only have to look at the slope. Of course, we had to exclude any datapoints correponding to value of 0.0.

In [50]:
var slope = computeSlope(yearsChile.map(_.toDouble),chileValues.map(_.toDouble))
println("Slope: " + slope)

In [51]:
var yIntercept = computeYIntercept(years.map(_.toDouble),chileValues.map(_.toDouble),slope)
println("y-Intercept: " + yIntercept)

In [52]:
var bestFitLine = bestFit(years.map(_.toDouble), slope,yIntercept)

We computed the best fit line so we are going to create a dataframe with three fields: precipitation, year, and 
the best fit line values. 

In [53]:
var bestFitLineIndexed = bestFitLine.zipWithIndex.map(_.swap)
var precipByYearChileBestLine = precipByYearChile.join(sc.parallelize(bestFitLineIndexed)).filter(r=>r._2._1._1>0)
var precip = precipByYearChileBestLine.values.map(r=>Row(r._1._1,r._1._2,r._2))

In [54]:
val schema4 = StructType(
    Array(
    StructField("Precipitation",DoubleType,true),
    StructField("Year",StringType,true),
    StructField("BestLine",DoubleType,true)
))

var precipBestLineDF = sqlctx.createDataFrame(precip,schema4)

In [63]:
precipBestLineDF.show()

+-------------+----+------------------+
|Precipitation|Year|          BestLine|
+-------------+----+------------------+
|     951060.0|1998|1120906.7870451212|
|    1152225.0|1990| 1077291.482628718|
|    1160289.0|2000| 1131810.613149222|
|    1306221.0|2002|1142714.4392533228|
|    1181614.0|2006|1164522.0914615244|
|  1272096.625|2004|1153618.2653574236|
|     897224.0|1996|1110002.9609410204|
|  800225.1875|2007|1169974.0045135748|
|   1231319.25|2005| 1159070.178409474|
|     971392.0|1995|  1104551.04788897|
|    1315783.0|1997|1115454.8739930708|
|    1239664.0|2001|1137262.5262012724|
|    1377889.0|2003|1148166.3523053732|
|    1083755.0|1999|1126358.7000971716|
+-------------+----+------------------+



In [64]:
%%brunel
data("precipBestLineDF") title("Precipitation Trend for Chile") 
x(Year:linear) y(Precipitation) 
axes(x:'Years',y:'Precipitation (million cubic meters)') interaction(none)
+ line x(Year:linear) y(BestLine) axes(x:'Years',y:'Precipitation (million cubic meters)') interaction(none)

NOTE: Can't adjust the unit of spacing between ticks on x axis. Therefore, the line of best fit has a little curve.

For Chile, we observe a positive trend in annual precipitation, despite the fact that during the last couple of years the annual precipitation decreased. This example shows you how to determine the trend for the annual precipitation of one country. Now we have to do it for all 91 countries or areas. Let us implement an automated way to determine the trends for all countries and areas.

In [57]:
totalPrecipByCountryDF.show(2)

We define a function that accepts a row holding precipatition values and the years measures were taken. It filters the years when the precipitation was zero. 

In [58]:
def filterZeroValuesAndAssociatedYears(row:Array[Double],years:Array[String]):Tuple2[List[String],List[Double]] = {
   
   var yearRow = (years,row).zipped
   var t:List[Tuple2[String,Double]] = yearRow.map(
   (x,y)=>new Tuple2(x,y)).toList
   
    var filtered = t.filter(r=>r._2>0)
   
    var year = filtered.map(r=>r._1)
    var precip = filtered.map(r=>r._2)
   
   return (year,precip)
}

In [59]:
var iterator = df.drop("Country or Area").rdd.toLocalIterator
var countries = df.select("Country or Area").rdd.collect().map(r=>r(0))

Next, we iterate through each row and pass it to the filterZeroValuesAndAssoicatedYears function which preprocesses the data before it is past to the computeSlope function.

In [60]:
import scala.collection.mutable.ArrayBuffer
var slopes = ArrayBuffer[Double]()

while(iterator.hasNext)
{
    val row = iterator.next()
    var filteredPair = filterZeroValuesAndAssociatedYears(row.toSeq.toArray.map(_.asInstanceOf[Double]),years)
    slopes += computeSlope(filteredPair._1.toArray.map(_.toDouble),filteredPair._2.toArray)
}

In [61]:
var slopeByCountry = (countries,slopes).zipped

Now we have an Array of Tuples, containing the slope of the fitted lines. If the value is positive, there is a positive trend for annual precipitation. If the value is negative, there is a negative trend for annual precipitation. Values near zero indicate a stable condition. 

To answer our second question, we have to determine all v with a negative value for the slope. The corresponding code is in the cell below.

In [62]:
var negativeTrendSlope = slopeByCountry.map((x,y)=> Tuple2(x,y)).filter(r=>r._2<0)
var negativeTrendCountries  = negativeTrendSlope.map(r=>println(r._1))

Andorra
Anguilla
Bahrain
Barbados
Central African Republic
China
China, Hong Kong SAR
Cuba
Denmark
Estonia
Guinea
Iraq
Israel
Italy
Kazakhstan
Latvia
Lithuania
Luxembourg
Marshall Islands
Monaco
Morocco
Paraguay
Poland
Portugal
Qatar
Republic of Moldova
Slovenia
Spain
Switzerland
Togo
Tunisia
Venezuela
Yemen


### Question 3: Which country or area diplays the steepest positive trend in precipitation?
We leave this question for you to be answered.

## Download Notebooks
After performing an analysis, you can download your results. In the **Menu Bar**, go to **File** and then to **Download as**. It is possible to download notebooks in various formats to your local file system. Then you can send your work and results to colleagues.

(C) Copyright IBM Corp. 2016