# Introduction to Data Science and Machine Learning [KIML3]

The aim of this workshop is to provide you with the basic knowledge and skills needed to get started with data analysis and machine learning using Python. We will dig into the Python programming language, try out relevant libraries, and discuss relevant strategies. Working with sample data sets, we will deepen our knowledge and build our own machine learning pipeline. In addition, we will compare different methods and algorithms on generated data.

## Table of Contents

### Curriculum

1. [**Machine Learning Run-Through**](../ml/ml-workflow-with-iris.ipynb)<br>
    Using a simple example data set we will explore the general machine learning workflow.
    
2. [**Python Basics**](../python/python-basics.ipynb)<br>
    Learn the basics of the Python programming language.
        
3. [**Efficient Computing with numpy**](../python/python-scientific-numpy.ipynb)<br>
    Apply the `numpy` library to compute efficiently with large amounts of data.

4. [**Data Handling with pandas**](../python/python-data-handling-pandas.ipynb)<br>
    Learn to work with tabular data, supported by the `pandas` library.

5. [**Plotting with matplotlib**](../python/python-plotting.ipynb)<br>
    Visualize data with plots, using functions of `matplotlib`.

6. [**Introduction to Statistics**](../stats/stats-basics.ipynb)<br>
    First steps with statistics concepts needed for data analysis.

7. [**Fitting**](../stats/stats-fitting-long.ipynb)<br>
    General idea of probability distributions, fitting a model to your data and what can go wrong.

8. **Machine Learning Deep Dive**<br>
    Using a test data set, learn and explore the machine learning workflow step by step.
    - [Part 1](../ml/ml-workflow-marbles-part-1.ipynb): Start with data import, data preparation, data exploration and feature selection/engineering.
    - [Part 2-basics](../ml/ml-workflow-marbles-part-2-only-basic-validation.ipynb): How to set up an ML model, train it and perform an initial validation.
    - [Part 2](../ml/ml-workflow-marbles-part-2.ipynb): How to set up an ML model, train it and perform an exhaustive validation and performance test on the trained model.
    - [Part 3](../ml/ml-workflow-marbles-part-3-exercises.ipynb): It's your turn! Who can build the best classifier?
    - [Unsupervised learning](../ml/ml-marbles-unsupervised-learning.ipynb): What about unsupervised learning on the data set?
    
9. [**ML Compendium**](../ml/ml-compendium.ipynb)<br>
    Using some generated data sets, compare the performance of different machine learning algorithms.

### Exercises

1. [**Exercise: Museums of France**](../exercises/exercise-museums.ipynb)<br>
    An exercise with a clear task, requiring you to apply what you have learned in the course.
   
2. [**Exercise: Titanic**](../exercises/exercise-titanic.ipynb)<br>
    An open-ended exercise to practice answering questions with data.

### Additional Resources

- [**Test Notebook**](../test.ipynb)<br>
    Verify that your Python stack is working.

- [**Jupyter Cheat Sheet**](../jupyter/cheatsheet.ipynb)<br>
    Some useful commands for Jupyter Notebook, mostly optional.
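
If you want a quick check before opening the Test Notebook, the following minimal sketch reports whether the libraries named in the curriculum can be found by your Python installation. It is not taken from the Test Notebook itself; the helper name `check_stack` and the list `LIBRARIES` are illustrative assumptions, and the check only confirms that a package is installed, not that it works correctly.

```python
from importlib.util import find_spec

# Library names taken from the curriculum above; extend the list as needed.
# (The list and the helper below are illustrative, not part of the course material.)
LIBRARIES = ["numpy", "pandas", "matplotlib"]


def check_stack(names):
    """Return a dict mapping each top-level module name to True if it can be located."""
    return {name: find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # Print one status line per library.
    for name, available in check_stack(LIBRARIES).items():
        print(f"{name}: {'ok' if available else 'MISSING'}")
```

Any library reported as `MISSING` needs to be installed in the environment that runs your notebooks.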

---
_This notebook is licensed under a [Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)](https://creativecommons.org/licenses/by-nc-sa/4.0/). Copyright © 2018-2024 [Point 8 GmbH](https://point-8.de)_