# What Is DevOps?
DevOps combines development (Dev) and operations (Ops) to increase the efficiency, speed, and security of software development and delivery compared to traditional processes. A more nimble software development lifecycle results in a competitive advantage for businesses and their customers. DevOps is usually pictured as a continuous loop.

DevOps is a software development practice that combines development and operations. It covers all the processes from development and testing through deployment to maintenance, or operations. Most of the time, pipelines are built to carry out the deployment process.

The development process is represented on the left-hand side of the loop, and the operations process is represented on the right-hand side of the loop.

The benefits of DevOps are,
1. Speed.
2. Improved collaboration.
3. Rapid deployment.
4. Quality and reliability.
5. Security.

# What Is MLOps?
MLOps, short for Machine Learning Operations, is very similar to DevOps. It is a core function of Machine Learning engineering, focused on streamlining the process of taking Machine Learning models to production, and then maintaining and monitoring them. MLOps is a collaborative function, often comprising Data Scientists, DevOps Engineers, and IT.

### What is the difference between MLOps and DevOps?
MLOps is a set of engineering practices specific to Machine Learning projects that borrow from the more widely adopted DevOps principles in software engineering. While DevOps brings a rapid, continuously iterative approach to shipping software applications, MLOps borrows the same principles to take Machine Learning models to production. In both cases, the outcome is higher software quality, faster patching and releases, and higher customer satisfaction.

### Why is MLOps needed?
Productionizing Machine Learning is difficult. The Machine Learning lifecycle consists of many complex components such as data ingest, data prep, model training, model tuning, model deployment, model monitoring, explainability, and much more. It also requires collaboration and hand-offs across teams, from Data Engineering to Data Science to ML Engineering. Naturally, it requires stringent operational rigor to keep all these processes synchronized and working in tandem. MLOps encompasses the experimentation, iteration, and continuous improvement of the Machine Learning lifecycle.

### What are the components of MLOps?
The span of MLOps in Machine Learning projects can be as focused or expansive as the project demands. In certain cases, MLOps can encompass everything from the data pipeline to model production, while other projects may require MLOps implementation of only the model deployment process. A majority of enterprises deploy MLOps principles across the following,
- EDA: Iteratively explore, share, and prep data for the machine learning lifecycle by creating reproducible, editable, and shareable datasets, tables, and visualizations.
- Data prep and feature engineering: Iteratively transform, aggregate, and de-duplicate data to create refined features. Most importantly, make the features visible and shareable across data teams, leveraging a feature store.
- Model training and tuning: Use popular open source libraries such as scikit-learn and hyperopt to train and improve model performance. As a simpler alternative, use automated machine learning tools such as AutoML to automatically perform trial runs and create reviewable and deployable code.
- Model review and governance: Track model lineage, model versions, and manage model artifacts and transitions through their lifecycle. Discover, share, and collaborate across ML models with the help of an open source MLOps platform such as MLflow.
- Model inference and serving: Manage the frequency of model refresh, inference request times, and similar production specifics in testing and QA. Use CI/CD tools such as repos and orchestrators (borrowing DevOps principles) to automate the pre-production pipeline.
- Model deployment and monitoring: Automate permissions and cluster creation to productionize registered models. Enable REST API model endpoints.
- Automated model retraining: Create alerts and automation to take corrective action in case of model drift due to differences in training and inference data.

# What Is Git?
Git is a Version Control System (VCS). It tracks changes to code over time, which means it helps in managing different versions of the code.

This is crucial because, if the current version breaks, an older version is available to fall back to.

Another important aspect of Git is that it is a distributed Version Control System. This means every contributor has a complete copy of the code base, including its history, and can work on it locally without affecting the shared master branch until the changes are pushed. The advantages of a distributed system are,
1. There is no single point of failure, because every copy can be used to restore the code base.
2. The location of the latest stable version of the code is still always known, because teams conventionally keep it in one agreed place (the master branch of the shared remote repo).

# What Is GitHub?
GitHub is an online service built on top of Git. It hosts the folder to which the local files are pushed. This online folder is called a repository, or a repo.

This repo can be a public repo or a private repo.

A public repo can be viewed by anyone who has the link. A private repo can be accessed only by users who have been granted permission by the owner of the repo.

There are other services similar to GitHub, such as Bitbucket and GitLab.

The difference between Git and GitHub is that Git is the version control tool that runs on the local machine, while GitHub is a service that hosts Git repositories remotely.

# Initial Setup
1. Install Git (for example, `brew install git` on macOS with Homebrew).
2. Check the installed version of Git, `git --version`.
3. Configure Git,
    - `git config --global user.name "<user name>"`
    - `git config --global user.email "user_email@xyz.com"`

# Repository
A repository is like a folder that contains all the files, packages, and dependencies that are required for, or are a part of, the project.

A repository, or a repo, can be either remote or local. It can also be either public (accessible by anyone) or private (invite only).

A repository that resides on the user's local machine is called a local repo. A repository that resides on GitHub is called a remote repo.

When a new repo is created on GitHub or in GitHub Desktop, a README.md file can be created along with it. A plain `git init` does not create one. The `.md` stands for Markdown, a lightweight markup language for formatting text.

To create a repository on a local machine, using command line,
1. Go to the home directory, `cd ~`.
2. Create a new empty directory, `mkdir <project_name>`.
3. Navigate to the directory, `cd <project_name>`.
4. Initialize the repository, `git init`. This will create a hidden `.git` directory in the project folder.
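
The steps above can be sketched as a short script. This is only an illustration: it works in a temporary directory instead of the home directory so that nothing real is touched, and `demo-project` is a hypothetical project name.

```shell
# A temp dir stands in for the home directory; the project name is made up.
workdir="$(mktemp -d)"
project_name="demo-project"

mkdir "$workdir/$project_name"
cd "$workdir/$project_name"
git init -q

# The hidden .git directory now exists, and Git reports a work tree.
ls -a
git rev-parse --is-inside-work-tree
```

Deleting the `.git` directory turns the project back into an ordinary folder, because all of the repository's history lives inside it.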

If GitHub Desktop is being used, then open the app and select "*Create a New Repository on your Local Drive...*". Check the box "*Initialize this repository with a README*".

# What Is `.gitignore`?
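A `.gitignore` file is used in a Git repository to list the files and directories that Git should not track because they are unnecessary to the project. Git starts ignoring matching untracked files as soon as the entry is saved in `.gitignore`; files that are already tracked are not affected. Committing and pushing the `.gitignore` file shares the same rules with everyone who uses the repository.

### Creating a `.gitignore` file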
1. `cd ~`.
2. `cd <project_name>`.
3. `touch .gitignore`.
4. Open `.gitignore` in an editor and type in the names or patterns of the files that Git should ignore, one per line.
5. `git add .gitignore`.
6. `git commit -m "<commit message>"`.
7. `git status`.
8. `git push -u origin master`.

On Unix-like systems such as macOS and Linux, any file whose name begins with a `.` is treated as a hidden file.
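
A minimal sketch of how the ignore rules behave, assuming a throwaway repo in a temporary directory. The file names and patterns (`secrets.env`, `build/`, `*.log`, `app.py`) are hypothetical and chosen only for illustration.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q

# Hypothetical patterns: one file name, one directory, one wildcard.
printf '%s\n' 'secrets.env' 'build/' '*.log' > .gitignore

# Create files that match the patterns, plus one that does not.
mkdir build
touch secrets.env build/output.bin debug.log app.py

# check-ignore prints each given path that Git will ignore.
git check-ignore secrets.env build/output.bin debug.log

# Only the files that are not ignored show up as untracked.
git status --short
```

`git check-ignore` is a convenient way to confirm that a pattern matches what was intended before anything is committed.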

# States In Git
Git has 3 main states that a project file can reside in: modified, staged, and committed.
- Modified means that the files have been changed but the changes have not been committed to the database yet.
- Staged means that the modified file has been marked in its current version to go into the next commit snapshot.
- Committed means that the data is safely stored in the local database.

The basic Git workflow is as follows,
1. Files are modified in the working tree.
2. Only the changes that should be a part of the next commit are selectively staged using the `git add` command. Alternatively, all the changes can be staged using `git add --all` or `git add -A`.
3. When a commit is performed using `git commit`, a snapshot of all the files that are in the staging area is permanently stored in the Git directory.
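
The three states can be observed with `git status --short`, as in the sketch below. It assumes a throwaway repo in a temporary directory; `notes.txt` and the user name and email are placeholder values, set locally so the commits work on an unconfigured machine.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
# Placeholder identity, stored only in this throwaway repo.
git config user.name "Example User"
git config user.email "user@example.com"

echo "first line" > notes.txt
git add notes.txt
git commit -q -m "Add notes"       # notes.txt is now committed

echo "second line" >> notes.txt
git status --short                 # " M notes.txt": modified, not staged

git add notes.txt
git status --short                 # "M  notes.txt": staged

git commit -q -m "Extend notes"
git status --short                 # no output: committed, working tree clean
```

In the short format, the first column describes the staging area and the second column describes the working tree, which is why the `M` moves from the second column to the first after `git add`.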

The following statement describes the process very well,

"*If a particular version of a file is in the Git directory, it’s considered committed. If it has been modified and was added to the staging area, it is staged. And if it was changed since it was checked out but has not been staged, it is modified*".

# Adding Files To The Staging Area
This is a step that should be taken before committing the changes. The steps to add are,
1. `git status`.
2. `git add --all` or `git add -A` or `git add <file_name>`.
3. `git status`.

# Removing/ Unstaging Files From The Staging Area
The steps to unstage are,
1. `git status`.
2. `git reset` or `git reset <file_name>`.
3. `git status`.
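
A short sketch of unstaging a single file, assuming a throwaway repo in a temporary directory. The file names and the identity values are placeholders.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo "v1" > a.txt
echo "v1" > b.txt
git add --all
git commit -q -m "Initial commit"

# Change both files and stage everything.
echo "v2" >> a.txt
echo "v2" >> b.txt
git add --all
git status --short      # both files are staged

# Unstage a.txt only; its edits stay in the working tree.
git reset -q -- a.txt
git status --short      # a.txt is modified, b.txt is still staged
```

Note that `git reset <file_name>` only changes what is staged. The edits made to the file are kept.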

# Committing Changes
Committing is like taking a snapshot of the code at a particular time. Each commit is like a version of the code. Consider the following,
- Say that you wrote 2 lines of code and made a commit. This will be the version 1 of the code.
- Then say you wrote 5 more lines of code and made a commit. This will be the version 2 of the code.
- Then say you wrote 20 more lines of code and made another commit. This will be the version 3 of the code.

If version 3 fails, you can fall back to version 2.

In a nutshell, a commit is a checkpoint that is created in time. A commit can be made at regular intervals or on a task basis. Whenever a commit is made, a checkpoint is created in the local repo.

Steps to commit,
1. `git status`.
2. `git add --all`.
3. `git commit -m "a message that helps in understanding the changes that are committed"`. Alternatively, to stage and commit all tracked changes in one step with an empty message, `git commit -a --allow-empty-message -m ""`.
4. `git status`.
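
The idea of commits as versions can be sketched as follows, assuming a throwaway repo in a temporary directory. `app.txt`, the commit messages, and the identity values are hypothetical.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

# Make three commits, each adding one more line to the same file.
for version in 1 2 3; do
    echo "code for version $version" >> app.txt
    git add --all
    git commit -q -m "Version $version"
done

git log --oneline           # three checkpoints, newest first

# Look at the file as it was one commit ago (version 2) without changing anything.
git show HEAD~1:app.txt
```

`HEAD~1` means "one commit before the current one", so `git show HEAD~1:app.txt` is one way to inspect an older version before deciding whether to fall back to it.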

# Pushing Changes From Local To Remote
`git push` propagates all the commits made in the local repo to the remote repo.

Steps to push changes to the remote are,
1. `git status`.
2. `git add --all`.
3. `git status`.
4. `git commit -m "message"`.
5. `git status`.
6. `git push -u origin main` or `git push -u origin master` or `git push -u origin <branch_name>`.
7. `git log`.
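
The push steps can be tried without a GitHub account, as in this sketch. It assumes a bare repository in a temporary directory as a stand-in for the remote repo; the paths and identity values are placeholders, and the branch name is read from Git because it may be `main` or `master` depending on the installation.

```shell
base="$(mktemp -d)"
git init -q --bare "$base/remote.git"       # stand-in for the remote repo

git init -q "$base/local"
cd "$base/local"
git config user.name "Example User"         # placeholder identity
git config user.email "user@example.com"
git remote add origin "$base/remote.git"

echo "hello" > README.md
git add --all
git commit -q -m "Add README"

# main or master, depending on Git's defaults.
branch="$(git symbolic-ref --short HEAD)"
git push -q -u origin "$branch"

# Both commands print the same commit id once the push has succeeded.
git rev-parse HEAD
git -C "$base/remote.git" rev-parse "$branch"
```

The `-u` flag records `origin/<branch_name>` as the upstream of the local branch, so later pushes and pulls on that branch can be run as plain `git push` and `git pull`.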

# Pulling Changes From Remote To Local
`git pull` propagates the changes made in the remote repo to the local repo. A contributor does this to update their local repo with all the changes made in the remote repo since their last pull. In simple words, it brings the local repo up to date with the remote.

`git pull` combines the functionality of `git fetch` and `git merge` in a single step.

To pull changes from remote to local,
1. `git status`.
2. `git pull` or `git pull <remote> <branch>`.
3. `git status`.
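
A sketch of two contributors sharing one remote, assuming a bare repository in a temporary directory as a stand-in for the remote repo. The directory names `first` and `second`, the file name, and the identity values are hypothetical.

```shell
base="$(mktemp -d)"
git init -q --bare "$base/remote.git"       # stand-in for the remote repo

# First contributor: create a commit and push it.
git init -q "$base/first"
cd "$base/first"
git config user.name "First User"           # placeholder identity
git config user.email "first@example.com"
git remote add origin "$base/remote.git"
echo "line 1" > notes.txt
git add --all
git commit -q -m "Add notes"
branch="$(git symbolic-ref --short HEAD)"   # main or master
git push -q -u origin "$branch"

# Second contributor: clone, so the local copy starts in sync with the remote.
git clone -q "$base/remote.git" "$base/second"

# The first contributor pushes another change.
echo "line 2" >> notes.txt
git commit -q -a -m "Extend notes"
git push -q

# The second contributor's copy is now behind; pull brings it up to date.
cd "$base/second"
git pull -q
cat notes.txt
```

Because the second contributor made no commits of their own, the pull here is a simple fast-forward. When both sides have new commits, the merge step of `git pull` combines them.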

# Cloning A Remote Repo On The Local
`git clone` creates a local copy of a Git repository from a remote server. It initializes a new Git repo on the local machine and populates it with the contents of the remote repo.

The way to perform this action is,
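- `git clone <url> <local_path>`. The `<local_path>` is optional; if it is omitted, Git clones into a new directory named after the repo.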

Any public repo on GitHub can be cloned in this way.
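
What a clone sets up can be seen in the sketch below. It assumes a small local repository as a stand-in for the remote, so a file path takes the place of the `<url>`; all names and identity values are placeholders.

```shell
base="$(mktemp -d)"

# Build a small source repo to clone from (a local path stands in for the URL).
git init -q "$base/source"
cd "$base/source"
git config user.name "Example User"         # placeholder identity
git config user.email "user@example.com"
echo "hello" > README.md
git add --all
git commit -q -m "Add README"

# Same shape as: git clone <url> <local_path>
git clone -q "$base/source" "$base/copy"

cd "$base/copy"
git remote -v            # origin points back at the source
git log --oneline        # the history came along with the files
```

The clone automatically gets a remote named `origin`, which is why `origin` appears in the push and pull commands.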

# Branching
Branching is a fundamental concept in Git that helps in creating parallel lines of development within a project. Each branch represents an independent line of work that makes it possible to experiment with new features, fix bugs, or explore different approaches without affecting the main codebase. Branches help to track different versions of the project and easily revert to a previous state if necessary.

It is possible to switch between branches to work on different tasks or features.

Create a new branch,
- `git branch <branch_name>`.

Switch to a branch,
- `git checkout <branch_name>`.

Create and switch to a new branch in a single step,
- `git checkout -b <branch_name>`.

Merge a branch into the current branch,
1. `git checkout <target_branch>` (the branch that should receive the changes).
2. `git merge <source_branch>` (the branch whose changes are to be merged in).

Delete a local branch,
- `git branch -d <branch_name>`.

Delete a remote branch,
- `git push origin --delete <branch_name>`.
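
The local branch commands above can be tried end to end, as in this sketch. It assumes a throwaway repo in a temporary directory; the branch name `feature/greeting`, the file names, and the identity values are hypothetical.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example User"         # placeholder identity
git config user.email "user@example.com"
echo "base" > app.txt
git add --all
git commit -q -m "Initial commit"
main_branch="$(git symbolic-ref --short HEAD)"   # main or master

# Create and switch to a feature branch, then commit on it.
git checkout -q -b feature/greeting
echo "hello" > greeting.txt
git add --all
git commit -q -m "Add greeting"

# Switch back: greeting.txt is not on the main branch yet.
git checkout -q "$main_branch"
ls

# Merge the feature branch into the current branch, then delete it.
git merge -q feature/greeting
git branch -d feature/greeting
git branch --list
```

`git branch -d` refuses to delete a branch whose commits have not been merged, which makes it a safe default for cleaning up.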

Best practices for branching,
- Keep branches up to date with the main branch to avoid merge conflicts.
- Use meaningful branch names that clearly indicate the purpose of the branch.
- Regularly delete branches that are no longer needed.
- Consider using a branching strategy like Gitflow or GitHub Flow to standardize the branching workflow.

NOTE: Two contributors changing the same lines of the same file in different branches will create a merge conflict, which has to be resolved by hand before the merge can be completed.
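
A conflict can be reproduced on purpose, as in the sketch below. It assumes a throwaway repo in a temporary directory; `settings.txt`, the branch name `change-blue`, and the identity values are hypothetical.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example User"         # placeholder identity
git config user.email "user@example.com"
echo "color = red" > settings.txt
git add --all
git commit -q -m "Initial settings"
main_branch="$(git symbolic-ref --short HEAD)"   # main or master

# One branch changes the line to blue.
git checkout -q -b change-blue
echo "color = blue" > settings.txt
git commit -q -a -m "Use blue"

# Meanwhile the main branch changes the same line to green.
git checkout -q "$main_branch"
echo "color = green" > settings.txt
git commit -q -a -m "Use green"

# The merge stops because both branches changed the same line.
if git merge change-blue; then
    outcome="merged"
else
    outcome="conflict"
    git status --short      # "UU settings.txt" marks the conflicted file
    git merge --abort       # go back to the state before the merge
fi
echo "$outcome"
```

Instead of aborting, the conflict can be resolved by editing the file to the desired content, staging it with `git add`, and running `git commit`.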

# Forking And Contributing To Projects