<table width="100%"><tr style="background-color:rgba(0, 0, 0, 0);">
<td style="border: 0px"><img height="65" width="172" src="assets/logo_wm_line.png"/></td>
<td style="border: 0px"><h2>Workshop on AI 2018 Tutorial</h2></td>
<td style="border: 0px"><img height="65" width="172" src="http://pyviz.org/assets/PyViz_logo_wm_line.png"/></td>
</tr></table>

## Welcome to the EarthML/PyViz tutorial!

This tutorial will show you how to get started using general-purpose, open-source Python tools for machine learning, analysis, and visualization of earth-science data. Python combines convenience (many libraries are already available, so most things are already done for you) with depth (anything can be customized, automated, optimized, and scaled up). The main downside of Python is the huge and bewildering variety of tools available, so here we guide you to a set of well-supported tools that enable a flexible, high-performance, scalable workflow.

For machine learning, we focus on techniques that let you prepare earth-science data such as satellite images for processing by any of the various tools commonly used in Python, including scikit-learn and TensorFlow. We won't tell you what algorithm to run or what parameters to give it, but we'll show you how to get the data ready for you to make those choices in your particular application.
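
As a rough illustration of what "getting the data ready" can mean, the sketch below reshapes a multi-band image into the two-dimensional `(n_samples, n_features)` layout that scikit-learn estimators expect, and then maps per-pixel results back onto the image grid. It is only a minimal sketch using NumPy: the random `scene` array stands in for real satellite data, the helper names `to_feature_matrix` and `to_image` are hypothetical, and the tutorial notebooks themselves may prepare data differently.

```python
import numpy as np

# Hypothetical stand-in for a satellite scene: 4 spectral bands, 3 x 5 pixels.
n_bands, height, width = 4, 3, 5
rng = np.random.default_rng(0)
scene = rng.random((n_bands, height, width))


def to_feature_matrix(image):
    """Reshape a (bands, y, x) array into (n_pixels, n_bands) rows."""
    bands = image.shape[0]
    # Move the band axis last, then flatten the two spatial axes.
    return np.moveaxis(image, 0, -1).reshape(-1, bands)


def to_image(values, height, width):
    """Reshape one value per pixel back into a (y, x) grid."""
    return np.asarray(values).reshape(height, width)


X = to_feature_matrix(scene)

# Row i of X holds all band values for pixel (i // width, i % width).
# The argmax below is just a placeholder for a model's per-pixel output.
labels = X.argmax(axis=1)
label_map = to_image(labels, height, width)
```

With the data in this layout, `X` can be passed to whichever estimator you choose, and `label_map` can be plotted alongside the original bands.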

Along the way, we will make sure that all of the data can easily be exposed, accessed, and visualized, using tools that provide:

- Full functionality in browsers (supporting local or remote usage not tied to a desktop environment)
- Full interactivity (inside and outside of plots)
- Specification in Python (not JavaScript, JSON, or other languages)
- Quick but fully customizable plots that minimize the need for coding
- Support for data of any size (really!)
- Free movement of your code between Jupyter notebooks, batch processing, and standalone web dashboards

We'll be using a wide range of open-source Python libraries that support these goals, including the [Anaconda](http://anaconda.com)-supported tools from the [PyViz](http://pyviz.org) initiative:
[Panel](http://panel.pyviz.org),
[HoloViews](http://holoviews.org),
[GeoViews](http://geoviews.org),
[Bokeh](https://bokeh.pydata.org),
[Datashader](http://datashader.org),
[Intake](http://intake.readthedocs.io), and
[Param](http://param.pyviz.org). All of this material is available for later study at the [EarthML](http://earthml.pyviz.org) website.

This tutorial will take you through all of the steps involved in typical analysis and prediction workflows, from data ingestion to delivering a complete shareable web dashboard with the results.

## Index and Schedule
<!-- Breakdown: 120 min, 60 min lunch, 180 min -->

- *Overview*
   * **10 min** &nbsp;[0 - Setup](http://earthml.pyviz.org): Setting up your system to run the tutorial.
   * **50 min** &nbsp;[1 - Introduction and Workflow](tutorial/01_Introduction_and_Workflow.ipynb): Example full Python workflow laying out all the steps.<br><br>
   <!-- fetch timeseries Fluxnet data using intake, visualize it in a geo context, slice and dice it, run a regression on it, and build a Panel dashboard with results -->
   <!-- Emphasize that the tools all support a huge range of data types, a huge range of processing steps, many ML algorithms, and many viz options -->
   <!-- Shortcuts, not dead ends -->

- *Data ingestion*
   * **10 min** &nbsp;[2 - Data Ingestion with Intake](tutorial/01_Data_Ingestion_with_Intake.ipynb): Loading large datasets efficiently with Intake, and immediately visualizing every step.
   * Topic: &nbsp;&nbsp;&nbsp;[Heat and Trees](topics/Heat_and_Trees.ipynb): Analysis of how trees affect heat distribution in urban areas.<br><br>

- *Data preprocessing*
   * **10 min** &nbsp;[3 - Alignment and Preprocessing](tutorial/03_Alignment_and_Preprocessing.ipynb): How to prepare your data for a machine learning pipeline or simulator.
   * **10 min** &nbsp;[4 - Resampling](tutorial/04_Resampling.ipynb): Resampling large datasets.
   * Topic: &nbsp;&nbsp;&nbsp;[Walker Lake](topics/Walker_Lake.ipynb): Visualizing the change in the NDVI over time for a great saline lake.
   * **30 min** &nbsp;*Exercise 1*
   * **60 min** &nbsp;*Lunch break*<br><br>
   
- *Running ML or other analysis or simulation tools*
   * **30 min** &nbsp;[5 - Machine Learning](tutorial/05_Machine_Learning.ipynb): Specifying an ML pipeline using the prepared training data.
   <!-- Use carbon flux with simple ML algorithm? -->
   * Topic: &nbsp;&nbsp;&nbsp;[LANDSAT Spectral Clustering](topics/Landsat_Spectral_Clustering.ipynb): Unsupervised clustering of LANDSAT data.
   <!--	Other examples with tensorflow, keras, pytorch using CNN or some other algorithms? -->
   <!--	Interfacing to external code? -->
   * **30 min** &nbsp;*Exercise 2*
   * **20 min** &nbsp;[6 - Scaling Up](http://ml.dask.org/): Running large ML pipelines distributed across nodes.
   * **30 min** &nbsp;*Exercise 3*<br><br>
   <!-- Cover Datashader here? -->

- *Visualization and sharing*
   * **20 min** &nbsp;[7 - Visualization](tutorial/06_Data_Visualization.ipynb): In-depth examples of visualization topics.
   <!-- Visualize every step; big data, small data, static, interactive, whatever, mentioning tile servers -->
   <!-- Demo EarthSim: Drawing tools/annotators,  Grabcut -->
   <!-- Datashader 8_Geography.ipynb? -->
   * **10 min** &nbsp;[8 - Sharing Results](http://pyviz.org/tutorial/A2_Dashboard_Workflow.ipynb): Sharing results as visualizations and interactive Panel apps and dashboards.
   <!-- Briefly show off a few dashboards (Glaciers, Birds, etc.) -->
   * **40 min** &nbsp;*Exercise 4*<br><br>
