In [None]:
# To run this notebook as a reveal.js presentation, run the following command in notebook's folder:
# `jupyter-nbconvert --to slides 1-python-in-one-hour-or-so.ipynb --reveal-prefix=reveal.js --post serve`

# EMERGING TECHNOLOGIES
# CHALLENGES & OPPORTUNITIES

# CHAPTER 1 ► GARTNER'S HYPE CYCLE 2017

Reference: https://goo.gl/bstq8e

![img/hype-cycle-gartner.png](img/hype-cycle-gartner.png)

## Key takeaways:

* **Heavy R&D spending from Amazon, Apple, Baidu, Google, IBM, Microsoft, and Facebook is fueling a race for Deep Learning and Machine Learning patents today and will accelerate in the future**

* **Artificial General Intelligence is going to become pervasive during the next decade, becoming the foundation of AI as a Service**

# CHAPTER 2 ► WHAT IS BIG DATA?

## Everybody has their own opinion! So I will give mine!

## The more or less conventional definition, the 5 Vs:

* **VOLUME**

* **VELOCITY**

* **VARIETY**

* **VERACITY**

* **VALUE**

Some add **Variability** and **Visualization**... Why not?

## BIG is a very relative notion!

![img/big-is-relative.png](img/big-is-relative.png)
Source: https://www.wired.com/2011/01/jumbo-shrimps-why-mega-mammals-still-looked-puny-next-to-the-biggest-dinosaurs/

## What I cannot handle with my available toolbox is Big Data! `[Personal view]`

* **Have you tried to open a `csv` file containing 10 million rows in Excel?**

* **Have you tried to visualize 72 million measurements on Google Earth?**
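When only summary statistics are needed, one way around such limits is to stream the file row by row instead of loading it whole. The sketch below is a minimal illustration using only the Python standard library; the function name `summarize_csv`, the column name `value`, and the tiny in-memory sample are assumptions made for this example, not part of any dataset shown here.

```python
import csv
import io

def summarize_csv(lines, column):
    """Stream CSV rows and return (count, mean, maximum) for one numeric column."""
    count, total, maximum = 0, 0.0, float("-inf")
    for row in csv.DictReader(lines):
        x = float(row[column])
        count += 1
        total += x
        maximum = max(maximum, x)
    mean = total / count if count else float("nan")
    return count, mean, maximum

# Tiny in-memory stand-in for a large file (hypothetical columns); with a real
# file, pass open(path, newline="") instead so rows are read one at a time.
sample = io.StringIO("lat,lon,value\n1.0,2.0,10\n1.5,2.5,30\n2.0,3.0,20\n")
count, mean, maximum = summarize_csv(sample, "value")
```

Because only three running values are kept, memory use does not grow with the number of rows, which is what a spreadsheet cannot offer.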

## The example of SAFECAST DATA VISUALIZATION 

![img/safecast-web.png](img/safecast-web.png)
Source: https://blog.safecast.org/

## Mapping 72 million measurements at once

* **ATTEMPT 1 ► FAILURE! - Not enough RAM and not the appropriate tool ► THIS IS BIG DATA FOR ME!**

* **ATTEMPT 2 ► SUCCESS! - Bought 16 GB of RAM and used the recent Datashader Python package (https://datashader.readthedocs.io)**

## With the right toolbox, it takes 3 s to render 72 million points [MacBook Pro with 16 GB RAM]

```python
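import datashader as ds
import datashader.transfer_functions as tf
from datashader.colors import inferno

def draw_map(df, plot_width, plot_height, colors, agg_func, interp, background_col):
    # Aggregate the points onto a plot_width x plot_height pixel grid
    cvs = ds.Canvas(plot_width=plot_width, plot_height=plot_height)
    agg = cvs.points(df, 'lon', 'lat', agg_func('value'))
    # Map the per-pixel aggregates to colors, then fill the empty pixels
    img = tf.shade(agg, cmap=colors, how=interp)
    return tf.set_background(img, color=background_col)

# df (columns 'lon', 'lat', 'value'), plot_width and plot_height are assumed to be defined beforehand
img = draw_map(df, plot_width, plot_height, inferno, ds.count, 'log', 'black')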
```

![img/safecast-map.png](img/safecast-map.png)
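The reason this approach scales is that the points are not drawn one by one: they are first aggregated into a fixed grid of pixels, and only that grid is colored. The following pure-Python sketch illustrates the idea under simplifying assumptions; it is not Datashader's actual implementation, and the names `rasterize_counts` and `log_scale` are hypothetical helpers introduced for this example.

```python
import math

def rasterize_counts(points, x_range, y_range, width, height):
    """Count how many (x, y) points fall into each cell of a width x height grid."""
    (x_min, x_max), (y_min, y_max) = x_range, y_range
    grid = [[0] * width for _ in range(height)]
    for x, y in points:
        if not (x_min <= x <= x_max and y_min <= y <= y_max):
            continue  # ignore points outside the canvas
        # Scale to pixel indices; clamp so the upper edge lands in the last cell
        col = min(int((x - x_min) / (x_max - x_min) * width), width - 1)
        row = min(int((y - y_min) / (y_max - y_min) * height), height - 1)
        grid[row][col] += 1
    return grid

def log_scale(grid):
    """Compress counts with log(1 + n) so dense cells do not hide sparse ones."""
    return [[math.log1p(n) for n in row] for row in grid]

# Four toy points on a 2 x 2 canvas (illustrative data only)
points = [(0.1, 0.1), (0.2, 0.1), (0.9, 0.9), (1.0, 1.0)]
counts = rasterize_counts(points, (0.0, 1.0), (0.0, 1.0), width=2, height=2)
shaded = log_scale(counts)
```

Whatever the number of input points, the structure that gets colored has only `width * height` cells, so the rendering cost is bounded by the image size rather than by the dataset size.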

# CHAPTER 3 ► BIG DATA INFRASTRUCTURE & FRAMEWORKS

## Batch vs. stream vs. hybrid processing frameworks

* **Batch-only frameworks: Apache Hadoop**

* **Stream-only frameworks: Apache Storm, Apache Samza**

* **Hybrid frameworks: Apache Spark, Apache Flink**

Reference: https://www.digitalocean.com/community/tutorials/hadoop-storm-samza-spark-and-flink-big-data-frameworks-compared
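To make the batch/stream distinction concrete, here is a toy sketch in plain Python. It does not use the API of any of the frameworks listed above; `batch_average` and `StreamingAverage` are hypothetical names, and the readings are made-up values chosen only for illustration.

```python
from statistics import mean

def batch_average(readings):
    """Batch style: the whole bounded dataset is available before computing."""
    return mean(readings)

class StreamingAverage:
    """Stream style: update the result as each record arrives, storing no history."""

    def __init__(self):
        self.count = 0
        self.total = 0.0

    def update(self, reading):
        self.count += 1
        self.total += reading
        return self.total / self.count  # a result is available after every record

readings = [12.0, 15.0, 9.0, 14.0]  # illustrative values only
batch_result = batch_average(readings)

stream = StreamingAverage()
running = [stream.update(r) for r in readings]
```

Both styles end at the same answer; the difference is that the streaming version yields an up-to-date result after each record and never needs the full dataset in memory.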

Note: There is a Master in HPC at ICTP: http://www.mhpc.it/

## Cloud computing in Africa - ISOC's update?

## Google cloud regions

![img/google-cloud-regions.png](img/google-cloud-regions.png)

## Amazon Web Services cloud regions

![img/amazon-regions.png](img/amazon-regions.png)

## Microsoft Azure regions

![img/microsoft-azure.png](img/microsoft-azure.png)

## ... 

## Is Africa a forgotten continent?

## If that's the case, it might change very soon!

## What about cloud computing and IoT?

**Every cloud computing platform targets this market and proposes a dedicated offering** - for instance: https://goo.gl/zAR3KT

# CHAPTER 4 ► WHEN AND WHY DO WE NEED SUCH A VOLUME OF DATA?

In [None]:
... 