# Introduction and purpose of this document

Since this is my first time managing a repo ALONE, I believe it is important to keep an instruction document that records how to do some basic operations from the command line, how to write Markdown in Jupyter Notebook (as in this document), and, most importantly, how to work like a professional ENGINEER.

This document contains the following sections:
1. How to work with Git in VS Code
2. Libraries required for Machine Learning
3. How to do version control

How to work with Git
========
To clone a repo from GitHub, click the tree-like Source Control button (third from the top) on the left-hand side of VS Code, click "Clone Repository", and paste the URL copied from GitHub. Choosing (or creating) a local directory for the repo finishes the cloning.

Before creating a new file (such as README.md), remember to run the following commands so that Git knows who we are.

`git config --global user.email "hi@example.com"`  

`git config --global user.name "this is my name"`
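
Git stores a mistyped key without complaint, so it is worth checking that both identity keys are really set. The sketch below is only an illustration: the helper names `read_global_config` and `missing_identity` and the sample text are made up for this note, and the sample assumes the `key=value` format printed by `git config --global --list`.

```python
import subprocess
from typing import List


def read_global_config() -> str:
    """Return the text printed by `git config --global --list` (requires Git)."""
    result = subprocess.run(
        ["git", "config", "--global", "--list"],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout


def missing_identity(config_listing: str) -> List[str]:
    """Return the identity keys that have no value in the listing."""
    found = set()
    for line in config_listing.splitlines():
        key, sep, value = line.partition("=")
        if sep and value.strip():
            found.add(key.strip().lower())
    return [k for k in ("user.name", "user.email") if k not in found]


# Hypothetical listing with a mistyped key ("user.nmae") on the second line.
sample = "user.email=hi@example.com\nuser.nmae=this is my name\n"
print(missing_identity(sample))
```

On a real machine, `missing_identity(read_global_config())` should return an empty list once both commands above have been run correctly.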

After this, we should know how to commit changes and push them to the remote repo. Say a new file has been created and you want to "upload" it to the repo. Follow these steps:
1. Save the file.
2. Stage the change with `git add .`
    1. This command stages every change under the current directory at once.
    2. To stage only one file, use `git add file1.py`.
3. Commit with `git commit -m "some description"`
4. Push with `git push`
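
The same stage, commit, and push sequence can be scripted. This is a minimal sketch, not a recommended tool: `upload_commands` and `run_all` are hypothetical helper names, and `run_all` assumes it is called inside a cloned repo that already has a remote configured.

```python
import shlex
import subprocess
from typing import List, Optional, Sequence


def upload_commands(message: str, paths: Optional[Sequence[str]] = None) -> List[List[str]]:
    """Build the git commands for steps 2 to 4 (stage, commit, push)."""
    targets = list(paths) if paths else ["."]  # "." stages everything
    return [
        ["git", "add", *targets],
        ["git", "commit", "-m", message],
        ["git", "push"],
    ]


def run_all(commands: List[List[str]], cwd: Optional[str] = None) -> None:
    """Run each command in order and stop at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, cwd=cwd, check=True)


# Print the commands instead of running them, so nothing is pushed by accident.
for cmd in upload_commands("some description", ["file1.py"]):
    print(shlex.join(cmd))
```

Calling `run_all(upload_commands("some description"))` would actually execute the three steps for all changed files.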



# Overview of CUDA, PyTorch, TensorFlow, Keras, and Scikit-learn

## 1. CUDA (Compute Unified Device Architecture)
- **Developed by**: NVIDIA
- **Purpose**: A parallel computing platform and application programming interface (API) model that allows software developers to use GPUs (Graphics Processing Units) for general-purpose computing.
- **Key Features**:
  - Leverages NVIDIA GPUs to accelerate computing tasks.
  - Enables parallel execution of tasks in a highly efficient manner.
  - Popular for tasks such as deep learning, scientific computing, and simulations.
  - CUDA C/C++ is used to write programs that run on NVIDIA GPUs.
  - Supports libraries like cuDNN (for deep learning) and cuBLAS (for linear algebra).

## 2. PyTorch
- **Developed by**: Facebook AI Research (FAIR, now part of Meta AI)
- **Purpose**: An open-source deep learning framework that provides flexibility and speed in building neural networks.
- **Key Features**:
  - Dynamic computation graph (define-by-run): the graph is built while the code runs in eager mode, so it can change from one execution to the next.
  - Strong GPU support with CUDA for accelerated computations.
  - Native support for tensor computations, autograd, and optimizers.
  - Extensively used in academia and industry for research and production.
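
To make the dynamic graph, autograd, and CUDA points concrete, here is a small sketch. It assumes PyTorch is installed; the input value and the branch condition are arbitrary choices for illustration.

```python
import torch

# Use the GPU through CUDA when one is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

x = torch.tensor(3.0, device=device, requires_grad=True)

# Ordinary Python control flow decides which operations are recorded,
# so the graph can differ from one run to the next.
if x.item() > 0:
    y = x ** 2
else:
    y = -x

y.backward()          # autograd computes dy/dx for the branch that was taken
grad = x.grad.item()  # d(x^2)/dx at x = 3
print(device, grad)
```

Because the positive branch is taken here, the gradient is `2 * x`; changing the input to a negative number would record a different graph and give a gradient of `-1`.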

## 3. TensorFlow
- **Developed by**: Google Brain
- **Purpose**: An open-source machine learning framework primarily used for training and inference of deep neural networks.
- **Key Features**:
  - TensorFlow 1.x used static computation graphs (define-and-run); TensorFlow 2.x executes eagerly by default and can still compile code into graphs with `tf.function`.
  - Strong integration with GPUs and TPUs for efficient computation.
  - Extensive deployment capabilities, including TensorFlow Lite for mobile and TensorFlow.js for running models in the browser.
  - Supported by TensorFlow Hub for pre-trained models and TensorFlow Extended (TFX) for production pipelines.

## 4. Keras
- **Developed by**: François Chollet (Keras is now also distributed with TensorFlow)
- **Purpose**: A high-level neural networks API written in Python that simplifies building and training deep learning models.
- **Key Features**:
  - Provides a user-friendly, modular interface for building neural networks.
  - Now integrated as `tf.keras` in TensorFlow, making it part of the TensorFlow ecosystem.
  - Early versions supported multiple backends (TensorFlow, Theano, CNTK); Keras 2.4 and later 2.x releases run on TensorFlow only, and Keras 3 again supports multiple backends (TensorFlow, JAX, PyTorch).
  - Allows rapid prototyping with simple model creation and training workflows.

## 5. Scikit-learn (sklearn)
- **Developed by**: The scikit-learn open-source community (started by David Cournapeau as a Google Summer of Code project in 2007)
- **Purpose**: A machine learning library for Python that provides simple and efficient tools for data mining and data analysis.
- **Key Features**:
  - Offers algorithms for classification, regression, clustering, dimensionality reduction, and model selection.
  - Built on top of NumPy, SciPy, and matplotlib, making it highly compatible with the scientific Python ecosystem.
  - Includes tools for model validation and evaluation, such as cross-validation and grid search.
  - Ideal for traditional machine learning tasks (e.g., SVM, random forests, decision trees) rather than deep learning.
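
Cross-validation and grid search can be combined in a few lines. The sketch below assumes scikit-learn is installed; the iris dataset, the SVM classifier, and the candidate values for `C` are arbitrary choices for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Small built-in dataset: 150 flowers, 4 features, 3 classes.
X, y = load_iris(return_X_y=True)

# Try three values of the regularization strength C,
# scoring each one with 5-fold cross-validation.
param_grid = {"C": [0.1, 1, 10]}
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=5)
search.fit(X, y)

print(search.best_params_, round(search.best_score_, 3))
```

`best_score_` is the mean cross-validated accuracy of the best setting, and `search.cv_results_` holds the scores for every candidate.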

## Summary of Key Differences
| Tool         | Type                     | Use Case                                                            | Runs on / Backend                    |
|--------------|--------------------------|---------------------------------------------------------------------|--------------------------------------|
| CUDA         | Parallel Computing       | Accelerating general-purpose computation with GPUs                  | NVIDIA GPUs                          |
| PyTorch      | Deep Learning Library    | Research and development of deep learning models                    | CPU, GPU (CUDA)                      |
| TensorFlow   | Deep Learning Library    | Scalable machine learning and production models                     | CPU, GPU, TPU                        |
| Keras        | High-level API           | Simplifying neural network model development                        | TensorFlow (default), other backends |
| Scikit-learn | Machine Learning Library | Classical machine learning tasks (e.g., regression, classification) | CPU (NumPy, SciPy)                   |

---

## Resources
- [CUDA Documentation](https://developer.nvidia.com/cuda-zone)
- [PyTorch Documentation](https://pytorch.org/docs/stable/)
- [TensorFlow Documentation](https://www.tensorflow.org/docs)
- [Keras Documentation](https://keras.io/)
- [Scikit-learn Documentation](https://scikit-learn.org/stable/)


How to do version control
=======
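This is an important topic because many libraries are required and some of their versions are not compatible with each other. "Version control" here means controlling library versions through conda environments. The commands in this section are run in Anaconda PowerShell.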

To check the current environments created by Anaconda, type `conda env list` in Anaconda PowerShell.

Note that the default environment is `base`; sometimes we need to create new envs. Use `conda create --name myenv python=3.9`, in which `myenv` is the name of the new env and 3.9 is the Python version.

To switch to `myenv`, run `conda activate myenv`.

Once the environment is set up, we can run `conda env export > environment.yml` to save it in a yml file. If the redirected file ends up in an encoding that conda cannot read (Windows PowerShell writes UTF-16 with `>`), `conda env export --file environment.yml` writes the file directly.

To recreate the environment on another device, run `conda env create -f environment.yml` there.
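
An `environment.yml` does not have to come from an export; a short hand-written file with a name and a dependency list also works with `conda env create -f`. The sketch below builds such a file as text. The helper name `environment_yml` and the package list are assumptions made for illustration, and the result is far shorter than what `conda env export` produces.

```python
from typing import Sequence


def environment_yml(name: str, python_version: str, packages: Sequence[str]) -> str:
    """Return the text of a minimal conda environment file."""
    lines = [
        f"name: {name}",
        "dependencies:",
        f"  - python={python_version}",
    ]
    lines += [f"  - {package}" for package in packages]
    return "\n".join(lines) + "\n"


# Hypothetical package selection; replace it with what the project needs.
text = environment_yml("myenv", "3.9", ["numpy", "scikit-learn"])
print(text)
```

Saving the printed text as `environment.yml` gives a file that can be passed to `conda env create -f environment.yml`.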

