<a href="https://colab.research.google.com/github/DanRHowarth/Artificial-Intelligence-Cloud-and-Edge-Implementations/blob/master/AI_Cloud_and_Edge_Implementations_Assessment.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# AI - Cloud and Edge Implementations - Assessment

## 1. Introduction

* Prior to taking this assessment, you should have completed the guided tutorials.
* To assess what you have learned, this notebook will take you through an end-to-end machine learning challenge and ask you to both implement code and to research and summarise key aspects of the machine learning process.

* Some points:
  * I will provide references to the course textbook so that you can refer to this as part of your research. You are free to use other resources, but please let me know what they are in your answers.
  * I will not be providing code. Feel free to use or reuse the code already provided. For learning purposes, I would recommend that you do not copy and paste, and that you comment each line so that you know what the code is doing.

* What we will assess:
  * I have provided assessment criteria:

![alt text](https://github.com/DanRHowarth/Artificial-Intelligence-Cloud-and-Edge-Implementations/blob/master/Screenshot%202019-12-12%20at%2021.16.45.png?raw=true)

  * We will cover your ability to understand and implement:
    * Exploratory Data Analysis
    * Selecting which algorithms to use for a given problem
    * Selecting and implementing appropriate metrics for assessing an algorithm
    * Model selection and evaluation, including when to use techniques such as cross-validation
    * An overall understanding of the different parts of machine learning and how they fit together
    * Optionally, more advanced techniques such as model pipelines and ensemble methods, and the chance to explore an algorithm in depth

* We need to start with a dataset...
  * We will use the `wine` dataset from the `scikit-learn` `datasets` module. This makes it easy to load and lets us focus on skills. It is also used in the course textbook, so it provides opportunities for you to bring across techniques used in the book.
  * If you want to use your own, or a different dataset, please do (let me know though). The important thing is that we are testing skills and techniques, so please just answer the questions below as normal. 

In [2]:
# get dataset
from sklearn.datasets import load_wine

# we load the dataset and save it as the variable data
data = load_wine()

# if we want to know what sort of detail is provided with this dataset, we can call .keys()
data.keys()

dict_keys(['data', 'target', 'target_names', 'DESCR', 'feature_names'])
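
The keys above can be explored a little further before starting EDA. The following is a minimal sketch, not part of the original notebook, that reads the size of the feature matrix and counts the samples in each class; the variable names `X`, `y`, `n_samples`, `n_features` and `class_counts` are illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import load_wine

# Load the dataset as before
data = load_wine()

# 'data' holds the feature matrix and 'target' holds the integer class labels
X, y = data.data, data.target

# Number of rows (samples) and columns (features)
n_samples, n_features = X.shape

# Count how many samples belong to each class (labels are 0, 1, 2)
class_counts = np.bincount(y)

print(n_samples, n_features)
print(list(data.feature_names))
print(dict(zip(data.target_names, class_counts)))
```

Printing `data.DESCR` gives the full text description that ships with the dataset.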

In [0]:
# more code here 


## 2. Exploratory Data Analysis

*  I summarised some of the main parts of EDA here:

![alt text](https://github.com/DanRHowarth/Artificial-Intelligence-Cloud-and-Edge-Implementations/blob/master/Screenshot%202019-11-11%20at%2022.02.10.png?raw=true)

* You can also find more detail in the Raschka book, Chapter 4, p.109, and in the notebooks already provided. 

**2.1: Question**
* Can you set out what the key aspects of EDA are? What should it cover and what techniques can you use?

**2.1: Answer**
* Answer here

**2.2: Question**
* Using the dataset and the techniques discussed, implement EDA below. Discuss your findings and how they influence your modelling choices.

In [0]:
# code here


## 3. Algorithm Selection

* Sections 3 - 5 of this notebook will refer to readings and ask questions that require written answers. Code implementation of these answers is in Section 6. The reason for doing it this way is that the techniques are very much intertwined, so splitting up the code implementations felt unnatural.
* A suggested way of working, therefore, is to go through the readings and answer the questions below while also implementing code under Section 6. That way you can learn while implementing and testing out code.

* For section 3, refer to Raschka, Chapter 2, p.19, and Chapter 3, p.53, as well as the provided `scikit-learn` tutorial.

**3.1: Question**
* Given our problem, what sorts of algorithms can we use? What are their benefits and limitations?

**3.1: Answer**
* Answer here

## 4. Model Evaluation and Selection

* For section 4, refer to Raschka, Chapter 6, p.191, as well as the provided `scikit-learn` tutorial.

**4.1: Question**
* What are the differences between model evaluation and model selection?
* What are the different techniques available to us? When would we use them?

**4.1: Answer**
* Answer here

## 5. Model Metrics

* For section 5, refer to Raschka, Chapter 6, p.211, as well as the provided `scikit-learn` tutorial.

**5.1: Question**
* What are the most appropriate metrics for the models you have chosen and why?

**5.1: Answer**
* Answer here


## 6. Implementation of Sections 3 - 5
* Use the cells below to implement a model evaluation and selection strategy on the algorithms you have chosen, against the metrics you believe are most appropriate.

In [0]:
# code here


In [0]:
# code here


In [0]:
# use as many cells as required 


## 7. Overall Understanding of ML

* In the class, we covered a framework for understanding how the pieces of ML fit together. 

![alt text](https://github.com/DanRHowarth/Artificial-Intelligence-Cloud-and-Edge-Implementations/blob/master/Screenshot%202019-11-11%20at%2022.01.44.png?raw=true)

* Do you agree with this framework? What is missing? Can you pull your own together? 

**7.1: Answer**
* Provide your thoughts on the ML framework and develop your own if you prefer to.

## 8. Optional Activities 

* For those of you who are already familiar with machine learning, or who just want to do some more learning, I would suggest looking at these topics:
  * explore one of the algorithms you used above in more depth. This could mean looking at the implementation of the algorithm on `scikit-learn` in more detail, or it could mean implementing it by hand. There are plenty of tutorials online, but please get in touch if you do not know where to start. 
  * explore the model pipeline features of `scikit-learn` in more detail. We covered this in the `scikit-learn` tutorial but there are more aspects to this that you could explore. Please see Raschka, Chapter 6, p. 191.
  * explore ensemble models, or other advanced models, in more detail. Again, we covered this in the `scikit-learn` tutorial but there are more aspects to this that you could explore. Please see Raschka, Chapter 7, p. 223.
* Note that these exercises don't form part of your assessment, but are optional extras if you wish to learn more within the course environment and support network. 
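
As a possible starting point for the pipeline topic, here is a minimal sketch rather than a reference solution. It assumes a `StandardScaler` followed by `LogisticRegression`, evaluated with 5-fold stratified cross-validation; the choice of scaler, classifier, `max_iter=1000`, `random_state=0` and accuracy as the metric are all illustrative assumptions, not choices prescribed by the course.

```python
from sklearn.datasets import load_wine
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Load features and labels directly as arrays
X, y = load_wine(return_X_y=True)

# Scaling lives inside the pipeline, so it is refit on each training fold
# and the held-out fold never influences the scaler.
pipe = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

# Stratified folds keep the class proportions similar in every split
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# One accuracy score per fold
scores = cross_val_score(pipe, X, y, cv=cv, scoring="accuracy")

print(scores.mean(), scores.std())
```

Swapping the final step for a different estimator, or passing the pipeline to `GridSearchCV`, are natural next things to try.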

In [0]:
# code here


## 9. Next Steps

* Once you have finished the notebook, please submit it to me on Slack by **28th February 2020**, with your name in the filename.
* Get in touch via Slack if you have any questions. I'm here to help, and to learn from you all too!

Thanks,

Dan