<a href="https://colab.research.google.com/github/aaubs/ds-master/blob/main/notebooks/M1_Colab_GitHub_Drive_Kaggle_v2.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Use Colab Notebooks, Google Drive, Kaggle, and GitHub
1. Visit the Google Colaboratory website.
2. Click the New Notebook button. A blank notebook is created and opened.

# Tool Usage Overview

This table provides an overview of the specific use cases for tools like Colab, GitHub, Hugging Face, GitHub Codespaces, Visual Studio Code, and Kaggle.

| Tool                   | Use Case Description                                                                                                                                               |
|------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| **Colab**              | - Ideal for running Jupyter notebooks in the cloud.                                                                                                                |
|                        | - Provides a free GPU for machine learning projects.                                                                                                               |
|                        | - Useful for collaborative data science and educational purposes.                                                                                                   |
| **GitHub**             | - Version control for tracking changes in source code.                                                                                                             |
|                        | - Collaborative platform for code review and project management.                                                                                                    |
|                        | - Hosting and documentation for open-source projects.                                                                                                               |
| **Hugging Face**       | - Platform for sharing and deploying machine learning models.                                                                                                       |
|                        | - Provides a large repository of pre-trained models, especially in natural language processing.                                                                     |
|                        | - Tools for training and fine-tuning machine learning models in the cloud.                                                                                          |
| **GitHub Codespaces**  | - Provides a development environment in the cloud, directly integrated with GitHub repositories.                                                                    |
|                        | - Supports a full Visual Studio Code editor in the browser.                                                                                                         |
|                        | - Useful for coding, debugging, and running applications without setting up a local environment.                                                                   |
| **Visual Studio Code** | - Popular code editor with extensive plugin support for programming languages and tools.                                                                           |
|                        | - Features integrated Git support for version control.                                                                                                             |
|                        | - Customizable and adaptable for various development needs, including remote development via extensions like Remote - SSH, Remote - Containers, and Remote - WSL. |
| **Kaggle**             | - A platform for data science competitions, datasets, and notebooks.                                                                                                |
|                        | - Offers a cloud-based work environment with free access to GPUs and TPUs.                                                                                          |
|                        | - Facilitates collaborative data science and provides opportunities for learning and career advancement through competition.                                        |

Use the table as a quick reference for the main functions and typical applications of each tool.


## Part 1: Mount Google Drive in a Google Colab Notebook
Run the code below to mount your Google Drive. Colab will ask you to authorize access to your Google account.

In [None]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive
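Once the cell reports `Mounted at /content/drive`, a quick check can confirm that your files are reachable. The sketch below assumes the default mount point `/content/drive` and that your files appear under a `MyDrive` folder; the helper name `drive_is_mounted` is illustrative and not part of the Colab API.

```python
import os


def drive_is_mounted(mount_point="/content/drive"):
    """Return True if a 'MyDrive' folder exists under the mount point."""
    return os.path.isdir(os.path.join(mount_point, "MyDrive"))


if drive_is_mounted():
    # Show a few top-level entries as a sanity check
    print(sorted(os.listdir("/content/drive/MyDrive"))[:5])
else:
    print("Drive not found; run drive.mount('/content/drive') first.")
```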


<img src="https://raw.githubusercontent.com/aaubs/ds-master/main/data/Images/M1_Colab_GitHub_1.png" alt="Image" width="400" height="250"/>
<img src="https://raw.githubusercontent.com/aaubs/ds-master/main/data/Images/M1_Colab_GitHub_2.png" alt="Image" width="400" height="250"/>
<img src="https://raw.githubusercontent.com/aaubs/ds-master/main/data/Images/M1_Colab_GitHub_3.png" alt="Image" width="400" height="250"/>


## Part 2: Connect to Kaggle and download a file

To connect to Kaggle, authenticate, and download a file in Google Colab, you can follow these steps:



***Step 2.1: Create your Kaggle API Token:***
- Go to Your Profile and click Edit Profile.
- Scroll down to the API section and click the Create New API Token button.
- A file named ```kaggle.json``` is downloaded. It contains your username and token key.


<img src="https://raw.githubusercontent.com/aaubs/ds-master/main/data/Images/M1_Kaggle_GitHub_1.png" alt="Image" width="500" height="150"/>

***Step 2.2: Upload kaggle.json to Google Drive***
- Create a folder in Google Drive (this notebook uses ```Kaggle```) to store your Kaggle datasets.
- Upload the downloaded ```kaggle.json``` file to that folder.

<img src="https://raw.githubusercontent.com/aaubs/ds-master/main/data/Images/M1_Kaggle_GitHub_2.png" alt="Image" width="400" height="200"/>

***Step 2.3: Configure Kaggle***

The code below sets the Kaggle configuration directory to the folder that contains ```kaggle.json```.
> Note: If you used a different folder name or directory path for ```kaggle.json```, use it instead of /Kaggle in the code below.

In [None]:
import os
os.environ['KAGGLE_CONFIG_DIR'] = "/content/drive/MyDrive/Kaggle"
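Before calling the Kaggle CLI, it can help to confirm that the token file is where the configuration points. The following is a minimal sketch, assuming the token file holds the `username` and `key` fields described in Step 2.1; `check_kaggle_config` is a hypothetical helper, not part of the Kaggle package.

```python
import json
import os


def check_kaggle_config(config_dir):
    """Return a list of problems found with kaggle.json (empty if none)."""
    token_path = os.path.join(config_dir, "kaggle.json")
    if not os.path.isfile(token_path):
        return [f"kaggle.json not found in {config_dir}"]
    try:
        with open(token_path) as f:
            token = json.load(f)
    except json.JSONDecodeError:
        return ["kaggle.json is not valid JSON"]
    if not isinstance(token, dict):
        return ["kaggle.json does not contain a JSON object"]
    # The token file is expected to hold a "username" and a "key" field
    return [f"missing field: {name}" for name in ("username", "key")
            if not token.get(name)]


config_dir = os.environ.get("KAGGLE_CONFIG_DIR", "/content/drive/MyDrive/Kaggle")
print(check_kaggle_config(config_dir) or "kaggle.json looks fine")
```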

***Step 2.4: Download the Kaggle datasets***

You can now download either a regular dataset or a competition dataset. Follow the steps below:
- Go to the [Kaggle dataset page](https://www.kaggle.com/datasets/heptapod/titanic) and click Copy API Command as shown:




<img src="https://raw.githubusercontent.com/aaubs/ds-master/main/data/Images/M1_Kaggle_GitHub_3.png" alt="Image" width="500" height="200"/>

- Your API command will look like ```kaggle datasets download -d <username>/<datasets>``` or ```kaggle datasets download -d <datasets>```


In [None]:
!kaggle datasets download -d heptapod/titanic --unzip

Dataset URL: https://www.kaggle.com/datasets/heptapod/titanic
License(s): DbCL-1.0
Downloading titanic.zip to /content
  0% 0.00/10.8k [00:00<?, ?B/s]
100% 10.8k/10.8k [00:00<00:00, 21.4MB/s]
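The run above saved the archive to the current working directory (`/content`), which is lost when the Colab runtime is recycled. To keep the files on Drive, the CLI's `-p` (`--path`) option sets the download folder. As a sketch, the hypothetical helper below builds such a command from a dataset page URL; the destination folder is an assumption, so substitute your own.

```python
import shlex
from urllib.parse import urlparse


def kaggle_download_command(dataset_url, dest_dir=None, unzip=True):
    """Build a `kaggle datasets download` command from a dataset page URL."""
    parts = urlparse(dataset_url).path.strip("/").split("/")
    # Dataset pages look like /datasets/<owner>/<dataset-name>
    if len(parts) < 3 or parts[0] != "datasets":
        raise ValueError(f"not a Kaggle dataset URL: {dataset_url}")
    command = ["kaggle", "datasets", "download", "-d", f"{parts[1]}/{parts[2]}"]
    if dest_dir:
        command += ["-p", dest_dir]  # -p / --path sets the download folder
    if unzip:
        command.append("--unzip")
    return shlex.join(command)


print(kaggle_download_command(
    "https://www.kaggle.com/datasets/heptapod/titanic",
    dest_dir="/content/drive/MyDrive/Kaggle",  # assumed folder from Step 2.2
))
```

In a Colab cell, prefix the printed command with `!` to run it.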


> Note: Datasets are downloaded as a zip file, which you would otherwise have to unzip manually. The ```--unzip``` flag extracts the archive right after the download and deletes the zip file.
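If you downloaded without the flag, the same result can be reproduced with Python's standard `zipfile` module. This is a minimal sketch; `unzip_and_remove` is an illustrative name, and the example path in the final comment is an assumption based on the output shown above.

```python
import os
import zipfile


def unzip_and_remove(zip_path, dest_dir=None):
    """Extract an archive next to itself (or into dest_dir), then delete it."""
    dest_dir = dest_dir or os.path.dirname(os.path.abspath(zip_path))
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest_dir)
        names = archive.namelist()
    os.remove(zip_path)  # mirrors what --unzip does after extraction
    return names


# Example call in Colab (assumed path): unzip_and_remove("/content/titanic.zip")
```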

## Part 3: Using Raw Format Data from GitHub in Google Colab

The following steps show how to use raw-format data directly from GitHub in your Google Colab notebooks.

***Step 3.1: Find the Data File on GitHub***

1. **Navigate to the GitHub repository** that contains the file you want to use.
2. **Find and click on the file** to open it.
3. **Open the file in Raw view** by clicking the “Raw” button at the top right of the file viewer.

***Step 3.2: Copy the Raw URL***

- The URL now in your browser’s address bar is the direct link to the raw data. Copy this URL.
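The raw URL can also be derived from the regular file page URL without clicking through. A small sketch, assuming the usual `github.com/<owner>/<repo>/blob/<branch>/<path>` layout; `to_raw_url` is a hypothetical helper.

```python
from urllib.parse import urlparse


def to_raw_url(blob_url):
    """Convert a GitHub file page URL into its raw.githubusercontent.com form."""
    parsed = urlparse(blob_url)
    parts = parsed.path.strip("/").split("/")
    # Expected shape: <owner>/<repo>/blob/<branch>/<path to file>
    if parsed.netloc != "github.com" or len(parts) < 5 or parts[2] != "blob":
        raise ValueError(f"not a GitHub file URL: {blob_url}")
    owner, repo, _, *rest = parts  # drop the "blob" segment
    return "https://raw.githubusercontent.com/" + "/".join([owner, repo, *rest])


print(to_raw_url(
    "https://github.com/DataSnowman/carprice/blob/master/dataset/carprice.csv"
))
```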

***Step 3.3: Load Data in Google Colab***

Depending on the format of your data, you might use different methods to load it into Colab:



In [None]:
#https://github.com/DataSnowman/carprice/blob/master/dataset/carprice.csv

In [None]:
import pandas as pd

# Replace 'url' with your copied raw URL
url = 'https://raw.githubusercontent.com/aaubs/ds-master/main/data/CarPrice_Assignment.csv'
data = pd.read_csv(url)
data.head()

Unnamed: 0,car_ID,symboling,CarName,fueltype,aspiration,doornumber,carbody,drivewheel,enginelocation,wheelbase,...,enginesize,fuelsystem,boreratio,stroke,compressionratio,horsepower,peakrpm,citympg,highwaympg,price
0,1,3,alfa-romero giulia,gas,std,two,convertible,rwd,front,88.6,...,130,mpfi,3.47,2.68,9.0,111,5000,21,27,13495.0
1,2,3,alfa-romero stelvio,gas,std,two,convertible,rwd,front,88.6,...,130,mpfi,3.47,2.68,9.0,111,5000,21,27,16500.0
2,3,1,alfa-romero Quadrifoglio,gas,std,two,hatchback,rwd,front,94.5,...,152,mpfi,2.68,3.47,9.0,154,5000,19,26,16500.0
3,4,2,audi 100 ls,gas,std,four,sedan,fwd,front,99.8,...,109,mpfi,3.19,3.4,10.0,102,5500,24,30,13950.0
4,5,2,audi 100ls,gas,std,four,sedan,4wd,front,99.4,...,136,mpfi,3.19,3.4,8.0,115,5500,18,22,17450.0
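The cell above uses `read_csv` because the file is a CSV. For other formats, one possible approach is to choose the pandas reader from the file extension. The mapping and the helper names `pick_reader` and `load` below are illustrative assumptions, and some readers need extra packages (for example, `read_excel` typically needs `openpyxl`).

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# File extension -> name of the pandas reader function
READERS = {
    ".csv": "read_csv",
    ".json": "read_json",
    ".xlsx": "read_excel",
    ".parquet": "read_parquet",
}


def pick_reader(url):
    """Return the name of the pandas reader that matches the URL's extension."""
    suffix = PurePosixPath(urlparse(url).path).suffix.lower()
    if suffix not in READERS:
        raise ValueError(f"no reader configured for '{suffix}' files")
    return READERS[suffix]


def load(url, **kwargs):
    """Load a remote file into a DataFrame using the matching reader."""
    import pandas as pd  # imported here so pick_reader works without pandas
    return getattr(pd, pick_reader(url))(url, **kwargs)


url = "https://raw.githubusercontent.com/aaubs/ds-master/main/data/CarPrice_Assignment.csv"
print(pick_reader(url))  # data = load(url) would then return the DataFrame
```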


## Part 4: Cloning and Pushing to GitHub Using VS Code

This tutorial provides step-by-step instructions on how to install Visual Studio Code and Git, and how to use them to clone and push to a GitHub repository. The instructions cover both macOS and Windows.

### Step 4.1: Install Visual Studio Code

#### For macOS:
1. Visit the [VS Code official website](https://code.visualstudio.com/) and download the stable build for macOS.
2. Open the downloaded `.zip` file and extract VS Code.
3. Drag `Visual Studio Code.app` to the `Applications` folder, making it available in the Launchpad.

#### For Windows:
1. Visit the [VS Code official website](https://code.visualstudio.com/) and download the stable build for Windows.
2. Run the downloaded `.exe` file and follow the installation prompts.
3. Ensure you select “Add to PATH” during installation to enable launching from the command line.

### Step 4.2: Install Git

#### For macOS:
1. Download the latest Git for macOS from the [Git website](https://git-scm.com/download/mac).
2. Follow the instructions to install Git. If you download a `.dmg` file, open it and follow the prompts to install Git.

#### For Windows:
1. Download the latest Git for Windows installer from the [Git website](https://git-scm.com/download/win).
2. Run the downloaded `.exe` file and follow the setup instructions.
3. Make sure to choose the recommended settings, especially for adjusting your PATH environment.

### Step 4.3: Configure Git

Open a terminal (macOS) or command prompt/Git Bash (Windows) and set your user name and email address with the following commands:

```bash
git config --global user.name "Your Name"
git config --global user.email "your.email@example.com"
```
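To confirm the settings took effect, you can read them back. The sketch below does this from Python with `subprocess`, assuming `git` is on your PATH; the helper name `git_global_config` is hypothetical.

```python
import subprocess


def git_global_config(key):
    """Return the global Git setting for `key`, or None if unset or Git is missing."""
    try:
        result = subprocess.run(
            ["git", "config", "--global", "--get", key],
            capture_output=True,
            text=True,
        )
    except FileNotFoundError:  # git is not installed or not on PATH
        return None
    value = result.stdout.strip()
    return value if result.returncode == 0 and value else None


for key in ("user.name", "user.email"):
    print(key, "=", git_global_config(key) or "(not set)")
```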

### Step 4.4: Clone a Repository Using VS Code

1. Open VS Code.
2. Open the Command Palette from the View menu by clicking `Command Palette`.
3. Type `Git: Clone` in the Command Palette and select it.
4. Enter the URL of the GitHub repository you want to clone and press `Enter`.
5. Select the directory where you want to save the repository and click `Select Repository Location`.
6. After the repository has been cloned, VS Code will ask if you want to open the cloned repository. Click `Open`.

### Step 4.5: Make Changes and Push to GitHub

1. Open the folder of the cloned repository in VS Code.
2. Make your desired changes to the files or add new files.
3. Commit your changes by entering a commit message in the message box and clicking the checkmark icon at the top of the Source Control sidebar.
4. Push your changes to GitHub by clicking the `...` button in the Source Control sidebar and selecting `Push` from the dropdown menu.
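For reference, the clone, commit, and push steps above correspond roughly to a handful of Git commands. The sketch below only prints them; the repository URL and commit message are placeholders, the helper names are hypothetical, and staging everything with `git add --all` is an assumption about what you want to commit.

```python
def repo_dir_from_url(repo_url):
    """Folder name that `git clone` creates by default for this URL."""
    name = repo_url.rstrip("/").split("/")[-1]
    return name[:-4] if name.endswith(".git") else name


def clone_commit_push_commands(repo_url, message):
    """Return (command, working directory) pairs for the workflow above."""
    repo_dir = repo_dir_from_url(repo_url)
    return [
        (["git", "clone", repo_url], "."),             # Step 4.4
        (["git", "add", "--all"], repo_dir),           # stage your edits
        (["git", "commit", "-m", message], repo_dir),  # checkmark icon
        (["git", "push"], repo_dir),                   # `...` > Push
    ]


# Placeholder URL: replace <user>/<repo> with your own repository
for args, cwd in clone_commit_push_commands(
    "https://github.com/<user>/<repo>.git", "Update notebook"
):
    print(f"[{cwd}] $ {' '.join(args)}")
```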

### Notes

- Ensure that you have the necessary permissions to push to the repository if it is not owned by you.
- If you are pushing to GitHub for the first time, you may be prompted to authenticate with your GitHub credentials.



## Part 5: Codespaces Using VS Code and Tabnine

### Step 5.1: Install Tabnine
In VS Code, open the Extensions view, search for Tabnine Enterprise (Self-Hosted), and install it. Do not confuse it with the other Tabnine extension.

<img src="https://docs.tabnine.com/~gitbook/image?url=https%3A%2F%2F3436682446-files.gitbook.io%2F%7E%2Ffiles%2Fv0%2Fb%2Fgitbook-x-prod.appspot.com%2Fo%2Fspaces%252FY2qxVf5VTm3fmwP4B4Gx%252Fuploads%252Fgit-blob-b7381dba80646c164af243960b39dfad9663ab90%252Fvsc1.webp%3Falt%3Dmedia&width=768&dpr=2&quality=100&sign=f2f61826&sv=1" width="500">



### Step 5.2: Sign In to Tabnine Using an Auth Token
- Open the Command Palette.
- Run `Tabnine: Sign in using auth token`.

<img src="https://docs.tabnine.com/~gitbook/image?url=https%3A%2F%2F3436682446-files.gitbook.io%2F%7E%2Ffiles%2Fv0%2Fb%2Fgitbook-x-prod.appspot.com%2Fo%2Fspaces%252FY2qxVf5VTm3fmwP4B4Gx%252Fuploads%252FVfFfzJ6QzB6QXOzwYER9%252Fsaas_auth_token_vsc_1.webp%3Falt%3Dmedia%26token%3D847e83ee-7861-4334-a419-da837895227e&width=768&dpr=2&quality=100&sign=b2a3b066&sv=1" width="500">


- The following popup will appear:

<img src="https://docs.tabnine.com/~gitbook/image?url=https%3A%2F%2F3436682446-files.gitbook.io%2F%7E%2Ffiles%2Fv0%2Fb%2Fgitbook-x-prod.appspot.com%2Fo%2Fspaces%252FY2qxVf5VTm3fmwP4B4Gx%252Fuploads%252FyxURDVNuC3hK2pJ6Vk6v%252Fsaas_auth_token_vsc_2.webp%3Falt%3Dmedia%26token%3Ded87d380-66a3-43dd-9c48-b972eb17e71a&width=768&dpr=2&quality=100&sign=936c359b&sv=1" width="500">

- If you already have an authentication token, click Sign in and skip to the relevant step below.
- If you don't already have a token, click Get auth token.
- The browser will open with the following screen for signing up, which includes a secret personal authentication token:
<img src="https://docs.tabnine.com/~gitbook/image?url=https%3A%2F%2F3436682446-files.gitbook.io%2F%7E%2Ffiles%2Fv0%2Fb%2Fgitbook-x-prod.appspot.com%2Fo%2Fspaces%252FY2qxVf5VTm3fmwP4B4Gx%252Fuploads%252FVL7SAZY9SuAMlvjOHzLy%252Fsaas_auth_token_jb_3.webp%3Falt%3Dmedia%26token%3D2df0c466-1eba-40c6-a848-9141c71b19f0&width=768&dpr=2&quality=100&sign=cbd18a65&sv=1" width="250">

- Copy the token and go back to your IDE.
- Paste your authentication token in the following popup and press Enter:

<img src="https://lh7-us.googleusercontent.com/YD9KAz4nlJYH8c5k1BCpQDqs-pnf-gqlJJv0MDZmnd8otSm7CvRZi32UaQxOF7wlWQZzv-G_XTUB7otAyqR7HRzPVwyVpgwHBsyoPEyspuWRPfV39ZlJ4s81sITxyes3Cqi1wRhPFkN3kiu5KwmVBgSAkw=s2048" width="500">



### Step 5.3: Create a GitHub Codespace in VS Code

Follow these steps to create a Codespace in Visual Studio Code:

- **Open Command Palette**:
   - Press `Ctrl+Shift+P` (or `Cmd+Shift+P` on macOS) to open the Command Palette.

- **Access Codespaces**:
   - Type `Codespaces: Create New Codespace` into the Command Palette and select it from the dropdown menu. The command is provided by the GitHub Codespaces extension.

- **Configure Your Codespace**:
   - Select the repository and branch you want to work on, then choose a machine type with the CPU and memory you need.

- **Start Coding**:
   - Once the setup is complete, the Codespace will open in your web browser, or you can choose to open it in Visual Studio Code directly from the browser session.
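Because the same notebook or script may run in Colab, in a Codespace, or on your own machine, it can be handy to detect the environment before choosing file paths. The sketch below relies on the `CODESPACES` environment variable that GitHub sets to `true` inside a codespace, and on the `google.colab` module being loaded in Colab; `detect_runtime` is a hypothetical helper.

```python
import os
import sys


def detect_runtime(environ=os.environ, modules=sys.modules):
    """Guess where the code is running: 'codespaces', 'colab', or 'local'."""
    if environ.get("CODESPACES") == "true":
        return "codespaces"
    if "google.colab" in modules:
        return "colab"
    return "local"


print("Running in:", detect_runtime())
```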

