In [34]:
from IPython.display import HTML

HTML('''<script>
code_show=false; 
function code_toggle() {
 if (code_show){
 $('div.input').hide();
 } else {
 $('div.input').show();
 }
 code_show = !code_show
} 
$( document ).ready(code_toggle);
</script>
<form action="javascript:code_toggle()"><input type="submit" value="Click here to toggle on/off the raw code."></form>''')

import os
from IPython.display import Markdown, display
from pathlib import Path
import frontmatter

# import the markdown file
course_introduction = frontmatter.load(os.path.join(Path.cwd(),'course-introduction.md'))
display(Markdown('# ' + course_introduction["title"])) 
display(Markdown(course_introduction.content))



# Course Introduction

## DARPA's AI Evolution Classification

If engineering difficulty has a pinnacle today, it must be in the AI domains that combine ML, optimal control and planning. Self-driving cars and humanoids from Boston Dynamics fit the bill.

Initially there were rules.

* In the 1980s, *knowledge-based* systems hard-coded knowledge about the world in formal languages.
  * IF this happens, THEN do that.
* They failed to gain traction because the number of rules needed to model the real world exploded.
* However, they are still in use today in simpler modeling domains such as fault management. For example, rule-based engines are used in many complex systems that manage mission critical infrastructures, e.g. [ONAP](http://wiki.onap.org).
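
The IF-THEN style of such systems can be illustrated with a minimal sketch. The rule names, facts and thresholds below are hypothetical and chosen only for illustration; they are not taken from ONAP or any real rule engine.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Facts = Dict[str, float]


@dataclass
class Rule:
    """One IF-THEN rule: a condition over the facts and the action it triggers."""
    name: str
    condition: Callable[[Facts], bool]
    action: str


def fire_rules(rules: List[Rule], facts: Facts) -> List[str]:
    """Return the actions of every rule whose condition holds for the facts."""
    return [rule.action for rule in rules if rule.condition(facts)]


# Hypothetical fault-management rules; the thresholds are made up.
RULES = [
    Rule("link_down", lambda f: f["packet_loss"] >= 1.0, "reroute_traffic"),
    Rule("overheating", lambda f: f["temperature_c"] > 85.0, "throttle_cpu"),
    Rule("disk_full", lambda f: f["disk_used"] > 0.95, "rotate_logs"),
]

observed = {"packet_loss": 0.02, "temperature_c": 91.0, "disk_used": 0.97}
print(fire_rules(RULES, observed))  # ['throttle_cpu', 'rotate_logs']
```

Every new situation needs a new hand-written rule, which is one way to see why the rule count explodes when the modeled world gets richer.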

The introduction of advanced AI methods a few years ago created a situation that we can explain with the following analogy.

![Cumberland Basin, April 1844](images/nautical-analogy.png)
*A nautical analogy on where we are today on AI for mission critical systems. Can you notice anything strange with this ship (Cumberland Basin, photo taken April 1844)?*

To put order into the many approaches and methods for delivering AI in our lives, DARPA classified AI development in terms of "waves". 

<section class="bg-apple" >
  <div class="wrap">
    <div class="grid">
      <div class="column">
        <h4>Wave I: Symbolic GOFAI</h4>
        <div class="embed">
          <iframe src="https://www.youtube.com/embed/qnKSfY_RDOU"></iframe>
        </div>
      </div>
      <div class="column">
        <h4>Wave II: Connectionism</h4>
          <div class="embed">
            <iframe src="https://www.youtube.com/embed/1dBLLB2qasM"></iframe>
          </div>
      </div>
      <div class="column">
        <h4>Wave III: Artificial General Intelligence</h4>
        <div class="embed">
          <iframe src="https://www.youtube.com/embed/LikxFZZO2sk"></iframe>
          </div>
      </div>
    </div>
  </div>
</section>

In the 1980s, rule-based engines started to be applied, manifesting the first wave of AI. In this example you see a system that performs highway trajectory planning. A combination of cleverly designed rules does work and offers real-time performance, but it cannot *generalize* and therefore does not perform acceptably in other environments.

Wave II started soon after 2010, when we began to apply a different philosophy to solving intelligent tasks such as object classification. The philosophy of connectionism, and the so-called deep neural network architectures, today dominate relatively simple (and mostly self-contained) tasks.

Wave III is at present an active research area, driven primarily by our inability to implement, with deep neural networks alone, things like long-term goal planning, causality, extracting meaning from text as humans do, explaining the decisions of neural networks, and transferring the learnings from one task to another, even similar, task. Artificial General Intelligence is the term usually associated with such capabilities.

Further, we will see a fusion of disciplines such as physical modeling and simulation with representation learning to help deep neural networks learn using data generated by domain specific simulation engines.

![heartflow.com](images/heartflow.png)
*Reveal the stenosis: generative augmented physical (Computational Fluid Dynamics) modeling from Computed Tomography scans*

For example, in the picture above a CFD simulation is used to augment ML algorithms that predict **and explain** those predictions. In mission critical systems (such as medical diagnostic systems) everything must be **explainable**.


## AI - A Systems Approach

Artificial Intelligence is the multidisciplinary science that aims to create agents that can think and act humanly or rationally. This course starts the new decade filled with the promises of the previous one. AI is not around the corner as many people predicted, but our purpose here is (a) to understand and appreciate the significant progress that certain components of AI have made over the last few years, and (b) to be able to synthesize such elements into AI systems.

We also need to appreciate that AI systems are right now engineered to perform well within a domain, and this trend may continue into the 2020s. We need a domain that we can use as an application theme. Given the importance of mission critical industries in the economy of every country, we have selected self-driving cars as the application domain. This domain requires the design of advanced agents that perceive the environment using noisy sensors, make decisions under uncertainty, actuate many drive-by-wire electronics to execute decisions, communicate with humans in natural language, sense the driver's psychological state, and much more.

As you already know, a huge component of AI is machine learning (ML), and that component alone is worth at least a couple of semesters. ML nowadays is used to process visual sensing (computer vision), verbal commands (speech to text) and many other front-end functions using structures known as Deep Neural Networks (DNNs). These functions are usually effective in modeling the reflexive part of the human brain. Their performance sometimes hides the enormous efforts by R&D teams to create carefully curated datasets for the task at hand. When supervised datasets are not enough for the design of reflexive agent policies, we need additional tools that build on DNNs and offer the possibility to learn control policies from world models, which in many instances take the shape of simulated environments. Deep Reinforcement Learning (DRL) is another subset of learning that can go beyond reflexive models, but it requires a model of the world, and this in turn requires significant effort in building simulation engines.

At the end of the day, AI is a system with the ability to represent the world and abstract concepts at multiple levels. If we were to draw the architecture of such a system, it would have the ability to quickly change depending on the domain and task at hand. Just like origami, AI systems will morph into an architecture, facilitated by high-speed interconnections between their subsystems. The controller that controls such changes must be topology aware, i.e. it must know the functional decomposition of the AI system and what support for representations and abstractions each subsystem can offer. How these can be combined and ultimately used is something that needs to be learned. To generalize, such morphic control agents must be designed to perform across task domains.

In a limited demonstration of such ability, in closed worlds such as games, we have agents that can process reflexive DNN outputs or DRL policies and can create abstractions at the symbolic level. Are they able to generalize? Doubtful. This brings us to a very interesting problem. For the vast majority of mission critical industries that have to do with the control of machines, we may reach a good enough performance level in this decade. The internet didn't have 1 Gbps at each household even 5 years ago, but the tens of kbps in the hands of innovators managed to change the world as we know it despite connectivity outages. The internet does not kill, many people will argue, but if anyone believes this analogy, AI seems to be in the same siloed state as the pre-internet era of the early 90s. The protocols and controls that will allow siloed AI systems to communicate, and by doing so demonstrate a first glimpse of general intelligence, are one of the missing links.

Let's now go over the course [syllabus](../..) to understand which elements of Wave II/III we will cover this semester.

In [31]:
# import the markdown file
ai_stack_pipelines = frontmatter.load(os.path.join(Path.cwd(),'ai-stack-pipelines.md'))
display(Markdown('# ' + ai_stack_pipelines["title"])) 
display(Markdown(ai_stack_pipelines.content))

# AI Software Stack

## A typical AI stack today

As we have seen from the [syllabus](docs/syllabus), this course approaches supervised and unsupervised learning methods from an applied perspective: this means teaching concepts while also looking at how these concepts are applied in industry to solve real world problems. In this respect, we take here an architecture-driven definition of data mining, presenting the components of data mining in the form of a software stack and also showing how the components are mechanized in what we call **ML Pipelines** to provide the ML utility to applications. For a complete overview of real world ML pipelines used today, go through the [TFX](http://stevenwhang.com/tfx_paper.pdf) paper in its entirety.

![AI stack](images/ai-stack.svg)
*AI Stack circa 2019*

### Landscape of the AI ecosystem
Due to the complexity involved and a common interest in addressing it, industrial players are partnering to define and implement the necessary components for the complete automation of AI pipelines. This work is going on within the Linux Foundation and the Deep Learning Foundation, amongst many other open source communities.

<section class="bg-apple">
              <div class="wrap">
                  <h2>Linux Foundation - Deep Learning Foundation</h2>
          <iframe width="2120" height="630" src="https://landscape.lfdl.io/format=landscape&fullscreen=yes" frameborder="0" allowfullscreen></iframe>
          </div>
</section>

[Deep Learning Foundation ecosystem.](https://landscape.lfai.foundation/fullscreen=yes)
 

## The four pipelines of an end to end data mining platform

![E2E ML Pipeline](images/acumos-E2E.svg)
*Example of end to end pipeline - serial arrangement*

![Data Pipeline](images/acumos-DP1.svg)
*Example of Data Pipeline*

![Model Training Pipeline](images/acumos-MTP.svg)
*Example of Model Training Pipeline*

![Model Evaluation and Validation Pipeline](images/acumos-MEVP.svg)
*Example of Model Evaluation and Validation Pipeline*

![Serving Pipeline](images/acumos-SP.svg)
*Example of Serving Pipeline*
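
The serial arrangement of the four pipelines can be sketched in a few lines. This is a toy illustration only: the function names, the one-parameter model and the acceptance threshold are assumptions made for this example and do not reflect how Acumos or TFX implement their stages.

```python
from statistics import mean
from typing import Callable, List, Optional, Tuple

RawRecord = Tuple[Optional[float], Optional[float]]
Example = Tuple[float, float]  # (feature, label)
Model = Callable[[float], float]


def data_pipeline(raw: List[RawRecord]) -> List[Example]:
    """Data pipeline: drop incomplete records and cast values to float."""
    return [(float(x), float(y)) for x, y in raw if x is not None and y is not None]


def training_pipeline(train: List[Example]) -> Model:
    """Model training pipeline: least-squares fit of y = a * x (no intercept)."""
    a = sum(x * y for x, y in train) / sum(x * x for x, _ in train)
    return lambda x: a * x


def evaluation_pipeline(model: Model, holdout: List[Example], max_mse: float) -> bool:
    """Evaluation and validation pipeline: accept the model only if the
    mean squared error on held-out data is below a (hypothetical) threshold."""
    mse = mean((model(x) - y) ** 2 for x, y in holdout)
    return mse <= max_mse


def serving_pipeline(model: Model, request: float) -> float:
    """Serving pipeline: answer a prediction request with the validated model."""
    return model(request)


# Made-up raw data with one incomplete record.
raw = [(1.0, 2.1), (2.0, 3.9), (None, 5.0), (3.0, 6.2), (4.0, 7.8)]

clean = data_pipeline(raw)
train, holdout = clean[:3], clean[3:]
model = training_pipeline(train)
validated = evaluation_pipeline(model, holdout, max_mse=0.5)
prediction = serving_pipeline(model, 5.0) if validated else None
print(validated, prediction)
```

Each stage consumes only the output of the previous one, which is what makes the stages independently replaceable and automatable.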

In [26]:
# import the markdown file
people_behind_ai = frontmatter.load(os.path.join(Path.cwd(),'people-behind-ai.md'))
display(Markdown('# ' + people_behind_ai["title"])) 
display(Markdown(people_behind_ai.content))

# The people behind AI

## Roles in AI product development

![Data scientists and other actors](images/acumos-actors.svg)
*Who do data scientists need to interact with during the development of an app?*


## Data Scientist - Engineering Job Description
This is a sample job description from Google. 

**Technologies**
machine-learning

**Job description**

**Minimum qualifications:**

* MS degree in a quantitative discipline (e.g., statistics, operations research, bioinformatics, economics, computational biology, computer science, mathematics, physics, electrical engineering, industrial engineering).
* 2 years of relevant work experience in data analysis or related field. (e.g., as a statistician / data scientist / computational biologist / bioinformatician).
* Experience with statistical software (e.g., R, Python, Julia, MATLAB, pandas) and database languages (e.g., SQL).

**Preferred qualifications:**

* PhD degree in a quantitative discipline as listed in Minimum Qualifications.
* 4 years of relevant work experience (e.g., as a statistician / computational biologist / bioinformatician / data scientist), including deep expertise and experience with statistical data analysis such as linear models, multivariate analysis, stochastic models, sampling methods. Analytical engagements outside class work while at school can be included.
* Applied experience with machine learning on large datasets.
* Experience articulating business questions and using mathematical techniques to arrive at an answer using available data. Experience translating analysis results into business recommendations.
* Demonstrated skills in selecting the right statistical tools given a data analysis problem. Demonstrated effective written and verbal communication skills.
* Demonstrated leadership and self-direction. Demonstrated willingness to both teach others and learn new techniques.

**About the job**

As a Data Scientist, you will evaluate and improve Google's products. You will collaborate with a multi-disciplinary team of engineers and analysts on a wide range of problems. This position will bring analytical rigor and statistical methods to the challenges of measuring quality, improving consumer products, and understanding the behavior of end-users, advertisers, and publishers.

Google is and always will be an engineering company. We hire people with a broad set of technical skills who are ready to take on some of technology's greatest challenges and make an impact on millions, if not billions, of users. At Google, data scientists not only revolutionize search, they routinely work on massive scalability and storage solutions, large-scale applications and entirely new platforms for developers around the world. From Google Ads to Chrome, Android to YouTube, Social to Local, Google engineers are changing the world one technological achievement after another.

**Responsibilities**

* Work with large, complex data sets. Solve difficult, non-routine analysis problems, applying advanced analytical methods as needed. Conduct end-to-end analysis that includes data gathering and requirements specification, processing, analysis, ongoing deliverables, and presentations.
* Build and prototype analysis pipelines iteratively to provide insights at scale. Develop comprehensive understanding of Google data structures and metrics, advocating for changes where needed for both products development and sales activity.
* Interact cross-functionally with a wide variety of people and teams. Work closely with engineers to identify opportunities for, design, and assess improvements to Google products.
* Make business recommendations (e.g. cost-benefit, forecasting, experiment analysis) with effective presentations of findings at multiple levels of stakeholders through visual displays of quantitative information.
* Research and develop analysis, forecasting, and optimization methods to improve the quality of Google's user facing products; example application areas include ads quality, search quality, end-user behavioral modeling, and live experiments.

At Google, we don’t just accept difference - we celebrate it, we support it, and we thrive on it for the benefit of our employees, our products and our community. Google is proud to be an equal opportunity workplace and is an affirmative action employer. We are committed to equal employment opportunity regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity or Veteran status. We also consider qualified applicants regardless of criminal histories, consistent with legal requirements. See also Google's EEO Policy and EEO is the Law. If you have a disability or special need that requires accommodation, please let us know by emailing candidateaccommodations@google.com

## Research Scientist, Machine Learning and Intelligence - Job Description

This is another sample job description from Google. Spot the differences with the previous one. 

**Minimum Qualifications**

* PhD in Computer Science, related technical field or equivalent practical experience.
* Programming experience in one or more of the following: C, C++ and/or Python.
* Experience in Natural Language Understanding, Computer Vision, Machine Learning, Algorithmic Foundations of Optimization, Data Mining or Machine Intelligence (Artificial Intelligence).
* Contribution to research communities and/or efforts, including publishing papers at conferences such as NIPS, ICML, ACL, CVPR, etc.

**Preferred Qualifications**
* Relevant work experience, including experience working within the industry or as a researcher in a lab.
* Ability to design and execute on research agenda.
* Strong publication record.

**About The Job**
Research in machine intelligence has already impacted user-facing services across Google including Search, Maps and Google Now. Google Research & Machine Intelligence teams are actively pursuing the next generation of intelligent systems for application to even more Google products. To achieve this, we’re working on projects that utilize the latest techniques in Machine Learning (including Deep Learning approaches like Google Brain) and Natural Language Understanding.

We’ve already been joined by some of the best minds, and we’re looking for talented Research Scientists that have applied experience in the fields of Machine Learning, Natural Language Processing and Machine Intelligence to join our team.

We do research differently here at Google. Research Scientists aren't cloistered in the lab, but instead they work closely with Software Engineers to discover, invent, and build at the largest scale. Ideas may come from internal projects as well as from collaborations with research programs at partner universities and technical institutes all over the world. From creating experiments and prototyping implementations to designing new architectures, Research Scientists and Software Engineers work on challenges in machine perception, data mining, machine learning, and natural language understanding. You stay connected to your research roots as an active contributor to the wider research community by partnering with universities and publishing papers.

There is always more information out there, and Research and Machine Intelligence teams have a never-ending quest to find it and make it accessible. We're constantly refining our signature search engine to provide better results, and developing offerings like Google Instant, Google Voice Search and Google Image Search to make it faster and more engaging. We're providing users around the world with great search results every day, but at Google, great just isn't good enough. We're just getting started.

**Responsibilities**

* Participate in cutting edge research in machine intelligence and machine learning applications.
* Develop solutions for real world, large scale problems.

At Google, we don’t just accept difference - we celebrate it, we support it, and we thrive on it for the benefit of our employees, our products and our community. Google is proud to be an equal opportunity workplace and is an affirmative action employer. We are committed to equal employment opportunity regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity or Veteran status. We also consider qualified applicants regardless of criminal histories, consistent with legal requirements. If you have a disability or special need that requires accommodation, please let us know.

To all recruitment agencies: Google does not accept agency resumes. Please do not forward resumes to our jobs alias, Google employees or any other company location. Google is not responsible for any fees related to unsolicited resumes.

## A Note on Conway's Law
> "Any organization that designs a system (defined broadly) will produce a design whose structure is a copy of the organization's communication structure." http://www.melconway.com/Home/Conways_Law.html

"We do research differently here at Google. Research Scientists aren't cloistered in the lab, but instead they work closely with Software Engineers to discover, invent, and build at the largest scale."

Contrast this with an organizational structure that isolates researchers from product development. What about [Alphabet's X](https://x.company/)?

In [27]:
# import the markdown file
terminology = frontmatter.load(os.path.join(Path.cwd(),'terminology.md'))
display(Markdown('# ' + terminology["title"])) 
display(Markdown(terminology.content))

# Machine Learning - Key Terms

The performance of simpler machine learning algorithms depends heavily on the **representation** of the data they are given. Each piece of information that is relevant to the problem and included in the representation is known as a **feature**.
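
A small sketch can make the dependence on representation concrete. The data below is made up: two classes of points placed on circles of radius 1 and 3. The "simple algorithm" is assumed here to be a single threshold on one feature, which is a deliberate simplification.

```python
import math
from typing import List


def best_threshold_accuracy(values: List[float], labels: List[int]) -> float:
    """Best accuracy achievable by the rule 'predict 1 if value > t'
    (or its negation), trying every observed value as the threshold t."""
    best = 0.0
    for t in sorted(values):
        preds = [1 if v > t else 0 for v in values]
        acc = sum(p == y for p, y in zip(preds, labels)) / len(labels)
        best = max(best, acc, 1.0 - acc)
    return best


# Made-up data: class 0 on a circle of radius 1, class 1 on a circle of radius 3.
points, labels = [], []
for k in range(8):
    angle = 2.0 * math.pi * k / 8
    points.append((math.cos(angle), math.sin(angle)))
    labels.append(0)
    points.append((3.0 * math.cos(angle), 3.0 * math.sin(angle)))
    labels.append(1)

# Representation 1: the Cartesian x coordinate as the only feature.
acc_x = best_threshold_accuracy([x for x, _ in points], labels)
# Representation 2: the distance from the origin as the only feature.
acc_radius = best_threshold_accuracy([math.hypot(x, y) for x, y in points], labels)

print(f"x feature: {acc_x:.2f}, radius feature: {acc_radius:.2f}")
```

The same threshold rule separates the classes perfectly on the radius feature but not on the raw x coordinate, even though both are computed from the same points.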

Modern ML systems **learn** the most suitable representations (still with some help from data scientists). An example is shown in the picture below.

![Hierarchical Features](images/hierarchical-features-classification.png)
*Hierarchical Feature Learning*

In **supervised** learning we present a training set $\{ \mathbf{x}_1, \dots, \mathbf{x}_N \}$ together with the corresponding labels, the target vectors $\{ \mathbf{y}_1, \dots, \mathbf{y}_N \}$.

![USPS](images/usps.png)
*Examples from the MNIST training dataset*

Our task is to construct a model such that a suitably chosen loss function is minimized for a *different* set of input data, the so-called test set. The ability to correctly *predict* / *classify* when observing the test set is called **generalization**.
 
![Zillow](images/home-prices-area.png)
*Birdseye view of home prices - Zillow predicts prices for similar homes in the same market.*
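
The train/test distinction can be sketched with a toy price predictor. Everything here is an assumption made for illustration: the data is synthetic (price = 50 + 2 × area plus Gaussian noise), the model is a straight line fitted by least squares, and the loss is the mean squared error. It is not how Zillow or any production system works.

```python
import random
from statistics import mean
from typing import List, Tuple


def make_dataset(rng: random.Random, n: int) -> Tuple[List[float], List[float]]:
    """Synthetic (area, price) pairs: price = 50 + 2 * area + Gaussian noise."""
    areas = [rng.uniform(50.0, 250.0) for _ in range(n)]
    prices = [50.0 + 2.0 * a + rng.gauss(0.0, 10.0) for a in areas]
    return areas, prices


def fit_line(xs: List[float], ys: List[float]) -> Tuple[float, float]:
    """Closed-form least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    return slope, y_bar - slope * x_bar


def mse(slope: float, intercept: float, xs: List[float], ys: List[float]) -> float:
    """Mean squared error loss of the fitted line on a dataset."""
    return mean((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys))


rng = random.Random(0)
train_x, train_y = make_dataset(rng, 40)  # used to fit the model
test_x, test_y = make_dataset(rng, 20)    # never seen during fitting

slope, intercept = fit_line(train_x, train_y)
train_loss = mse(slope, intercept, train_x, train_y)
test_loss = mse(slope, intercept, test_x, test_y)

# Naive baseline: always predict the average training price.
train_mean = mean(train_y)
baseline_loss = mean((train_mean - y) ** 2 for y in test_y)

print(f"train MSE={train_loss:.1f}  test MSE={test_loss:.1f}  baseline={baseline_loss:.1f}")
```

The test loss, not the training loss, is the quantity that measures generalization; comparing it with a naive baseline shows whether the model has learned anything useful about unseen homes.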

In unsupervised learning, we present a training set $\{ \mathbf{x}_1, \dots, \mathbf{x}_N \}$  without labels. We construct a partition of the data into some number $K$ of **clusters**, such that a suitably chosen loss function is minimized for a *different* set of input data, the so-called test set.

![Clustering](images/unsupervised.png)
*Clustering showing two classes and the exemplars per class*
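
One common way to build such a partition is k-means, sketched below in plain Python for $K = 2$. The data is synthetic (two Gaussian blobs), and the initial centroids are simply the first and last points, a simplification chosen for reproducibility rather than a recommended initialization.

```python
import random
from typing import List, Tuple

Point = Tuple[float, float]


def sq_dist(p: Point, q: Point) -> float:
    """Squared Euclidean distance between two 2-D points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def assign(points: List[Point], centroids: List[Point]) -> List[int]:
    """Assign each point to the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda k: sq_dist(p, centroids[k]))
            for p in points]


def kmeans(points: List[Point], centroids: List[Point],
           iterations: int = 10) -> Tuple[List[Point], List[int]]:
    """Alternate between assigning points and recomputing cluster means."""
    for _ in range(iterations):
        labels = assign(points, centroids)
        updated = []
        for k, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                updated.append((sum(p[0] for p in members) / len(members),
                                sum(p[1] for p in members) / len(members)))
            else:
                updated.append(old)  # keep the centroid of an empty cluster
        centroids = updated
    return centroids, assign(points, centroids)


# Synthetic data: two blobs centred at (0, 0) and (5, 5).
rng = random.Random(0)
blob_a = [(rng.gauss(0.0, 0.5), rng.gauss(0.0, 0.5)) for _ in range(20)]
blob_b = [(rng.gauss(5.0, 0.5), rng.gauss(5.0, 0.5)) for _ in range(20)]
points = blob_a + blob_b

centroids, labels = kmeans(points, [points[0], points[-1]])
loss = sum(sq_dist(p, centroids[label]) for p, label in zip(points, labels))
print(centroids, loss)
```

The loss here is the sum of squared distances from each point to its cluster centroid; the centroids play the role of the exemplars per cluster.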