# Problem Statement

## **Business Context**

"Visit with Us," a leading travel company, is revolutionizing the tourism industry by leveraging data-driven strategies to optimize operations and customer engagement. While introducing a new package offering, such as the Wellness Tourism Package, the company faces challenges in targeting the right customers efficiently. The manual approach to identifying potential customers is inconsistent, time-consuming, and prone to errors, leading to missed opportunities and suboptimal campaign performance.

To address these issues, the company aims to implement a scalable and automated system that integrates customer data, predicts potential buyers, and enhances decision-making for marketing strategies. By utilizing an MLOps pipeline, the company seeks to achieve seamless integration of data preprocessing, model development, deployment, and CI/CD practices for continuous improvement. This system will ensure efficient targeting of customers, timely updates to the predictive model, and adaptation to evolving customer behaviors, ultimately driving growth and customer satisfaction.


## **Objective**

As an MLOps Engineer at "Visit with Us," your responsibility is to design and deploy an MLOps pipeline on GitHub to automate the end-to-end workflow for predicting customer purchases. The primary objective is to build a model that predicts whether a customer will purchase the newly introduced Wellness Tourism Package before contacting them. The pipeline will include data cleaning, preprocessing, transformation, model building, training, evaluation, and deployment, ensuring consistent performance and scalability. By leveraging GitHub Actions for CI/CD integration, the system will enable automated updates, streamline model deployment, and improve operational efficiency. This robust predictive solution will empower policymakers to make data-driven decisions, enhance marketing strategies, and effectively target potential customers, thereby driving customer acquisition and business growth.

## **Data Description**

The dataset contains customer and interaction data that serve as key attributes for predicting the likelihood of purchasing the Wellness Tourism Package. The detailed attributes are:

**Customer Details**
- **CustomerID:** Unique identifier for each customer.
- **ProdTaken:** Target variable indicating whether the customer has purchased a package (0: No, 1: Yes).
- **Age:** Age of the customer.
- **TypeofContact:** The method by which the customer was contacted (Company Invited or Self Inquiry).
- **CityTier:** The city category based on development, population, and living standards (Tier 1 > Tier 2 > Tier 3).
- **Occupation:** Customer's occupation (e.g., Salaried, Freelancer).
- **Gender:** Gender of the customer (Male, Female).
- **NumberOfPersonVisiting:** Total number of people accompanying the customer on the trip.
- **PreferredPropertyStar:** Preferred hotel rating by the customer.
- **MaritalStatus:** Marital status of the customer (Single, Married, Divorced).
- **NumberOfTrips:** Average number of trips the customer takes annually.
- **Passport:** Whether the customer holds a valid passport (0: No, 1: Yes).
- **OwnCar:** Whether the customer owns a car (0: No, 1: Yes).
- **NumberOfChildrenVisiting:** Number of children below age 5 accompanying the customer.
- **Designation:** Customer's designation in their current organization.
- **MonthlyIncome:** Gross monthly income of the customer.

**Customer Interaction Data**
- **PitchSatisfactionScore:** Score indicating the customer's satisfaction with the sales pitch.
- **ProductPitched:** The type of product pitched to the customer.
- **NumberOfFollowups:** Total number of follow-ups by the salesperson after the sales pitch.-
- **DurationOfPitch:** Duration of the sales pitch delivered to the customer.


# Model Building

In [4]:
# Create a folder for storing the model building files
import os

# Define base path
base_path = "/content/tourism_project"
subfolders = ["data", "model_building", "deployment"]

# Create folders
for folder in subfolders:
    os.makedirs(os.path.join(base_path, folder), exist_ok=True)

print("Project folders created successfully.")

Project folders created successfully.


In [5]:
from google.colab import files

uploaded = files.upload()

Saving tourism.csv to tourism.csv


In [6]:
import shutil

shutil.move("tourism.csv", "/content/tourism_project/data/tourism.csv")

'/content/tourism_project/data/tourism.csv'



*   Created the foundational folder structure for the project using os.makedirs(). This includes a master directory tourism_project/ with subfolders for data/, model_building/, and deployment/.




## Data Registration

In [7]:
!pip install -q datasets huggingface_hub

In [8]:
from huggingface_hub import login

login(token="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")

In [9]:
from huggingface_hub import HfApi

api = HfApi()
user = api.whoami()
print(f"Authenticated as: {user['name']}")

Authenticated as: sgonuru


In [10]:
# Load dataset and push to Hugging Face
import pandas as pd
from datasets import Dataset

# Load the dataset
df = pd.read_csv("/content/tourism_project/data/tourism.csv")

# Convert to Hugging Face Dataset
hf_dataset = Dataset.from_pandas(df)

# Push to Hugging Face Dataset Hub
hf_dataset.push_to_hub("sgonuru/visit-with-us-raw")

Uploading the dataset shards:   0%|          | 0/1 [00:00<?, ? shards/s]

Creating parquet from Arrow format:   0%|          | 0/5 [00:00<?, ?ba/s]

Processing Files (0 / 0)      : |          |  0.00B /  0.00B            

New Data Upload               : |          |  0.00B /  0.00B            

                              : 100%|##########|  119kB /  119kB            

No files have been modified since last commit. Skipping to prevent empty commit.


CommitInfo(commit_url='https://huggingface.co/datasets/sgonuru/visit-with-us-raw/commit/6783edeaf746b0dac597c83fce9efa6c2a236d7a', commit_message='Upload dataset', commit_description='', oid='6783edeaf746b0dac597c83fce9efa6c2a236d7a', pr_url=None, repo_url=RepoUrl('https://huggingface.co/datasets/sgonuru/visit-with-us-raw', endpoint='https://huggingface.co', repo_type='dataset', repo_id='sgonuru/visit-with-us-raw'), pr_revision=None, pr_num=None)



*   Uploaded the raw tourism.csv dataset to the data/ folder and registered it on the Hugging Face Dataset Hub using the datasets library.




## Data Preparation

In [11]:
#Load dataset from Hugging Face
from datasets import load_dataset
import pandas as pd
from sklearn.model_selection import train_test_split

dataset = load_dataset("sgonuru/visit-with-us-raw", split="train")
df = pd.DataFrame(dataset)



In [12]:
# Drop unnecessary columns
df.drop(columns=["CustomerID"], inplace=True)

# Handle missing values
df.dropna(inplace=True)

# Encode categorical variables (basic label encoding)
categorical_cols = df.select_dtypes(include="object").columns
df[categorical_cols] = df[categorical_cols].astype("category").apply(lambda x: x.cat.codes)



In [13]:
# Split into train and test sets
train_df, test_df = train_test_split(df, test_size=0.2, random_state=42, stratify=df["ProdTaken"])

# Saving locally
train_path = "/content/tourism_project/data/train.csv"
test_path = "/content/tourism_project/data/test.csv"

train_df.to_csv(train_path, index=False)
test_df.to_csv(test_path, index=False)



In [15]:
# Upload processed datasets to Hugging Face
from datasets import Dataset

train_dataset = Dataset.from_pandas(train_df)
test_dataset = Dataset.from_pandas(test_df)

train_dataset.push_to_hub("sgonuru/visit-with-us-train")
test_dataset.push_to_hub("sgonuru/visit-with-us-test")

Uploading the dataset shards:   0%|          | 0/1 [00:00<?, ? shards/s]

Creating parquet from Arrow format:   0%|          | 0/4 [00:00<?, ?ba/s]

Processing Files (0 / 0)      : |          |  0.00B /  0.00B            

New Data Upload               : |          |  0.00B /  0.00B            

                              : 100%|##########| 95.4kB / 95.4kB            

                              : 100%|##########| 95.4kB / 95.4kB            

Uploading the dataset shards:   0%|          | 0/1 [00:00<?, ? shards/s]

Creating parquet from Arrow format:   0%|          | 0/1 [00:00<?, ?ba/s]

Processing Files (0 / 0)      : |          |  0.00B /  0.00B            

New Data Upload               : |          |  0.00B /  0.00B            

                              : 100%|##########| 28.0kB / 28.0kB            

CommitInfo(commit_url='https://huggingface.co/datasets/sgonuru/visit-with-us-test/commit/870fe6b5844102de8ca93dbac53059c063a23efe', commit_message='Upload dataset', commit_description='', oid='870fe6b5844102de8ca93dbac53059c063a23efe', pr_url=None, repo_url=RepoUrl('https://huggingface.co/datasets/sgonuru/visit-with-us-test', endpoint='https://huggingface.co', repo_type='dataset', repo_id='sgonuru/visit-with-us-test'), pr_revision=None, pr_num=None)



*   Loaded the raw dataset from the Hugging Face Dataset Hub and performed essential preprocessing steps, including dropping irrelevant columns, handling missing values, and encoding categorical features. The cleaned data was split into training and testing sets using stratified sampling to preserve class distribution. Both subsets were saved to the folder and re-uploaded to the Hugging Face Dataset Hub for consistent access across the MLOps pipeline.



## Model Training and Registration with Experimentation Tracking

In [16]:
!pip install -q scikit-learn xgboost mlflow huggingface_hub


[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m40.0/40.0 kB[0m [31m3.1 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m8.9/8.9 MB[0m [31m75.3 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m2.4/2.4 MB[0m [31m74.8 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m1.3/1.3 MB[0m [31m60.6 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m147.8/147.8 kB[0m [31m13.8 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m114.9/114.9 kB[0m [31m9.0 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m85.0/85.0 kB[0m [31m8.4 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m76.8/76.8 kB[0m [31m6.6 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

In [17]:
# Load train/test datasets from Hugging Face
from datasets import load_dataset
import pandas as pd

train_ds = load_dataset("sgonuru/visit-with-us-train", split="train")
test_ds = load_dataset("sgonuru/visit-with-us-test", split="train")

train_df = pd.DataFrame(train_ds)
test_df = pd.DataFrame(test_ds)



README.md: 0.00B [00:00, ?B/s]

data/train-00000-of-00001.parquet:   0%|          | 0.00/95.4k [00:00<?, ?B/s]

Generating train split:   0%|          | 0/3302 [00:00<?, ? examples/s]

README.md: 0.00B [00:00, ?B/s]

data/train-00000-of-00001.parquet:   0%|          | 0.00/28.0k [00:00<?, ?B/s]

Generating train split:   0%|          | 0/826 [00:00<?, ? examples/s]

In [18]:
# Separate features and target
X_train = train_df.drop("ProdTaken", axis=1)
y_train = train_df["ProdTaken"]

X_test = test_df.drop("ProdTaken", axis=1)
y_test = test_df["ProdTaken"]

In [19]:
# Train and tune a model (Random Forest)
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import GridSearchCV
import mlflow
import mlflow.sklearn

# Enable MLflow autologging
mlflow.sklearn.autolog()

with mlflow.start_run():
    # Define model and hyperparameters
    rf = RandomForestClassifier(random_state=42)
    param_grid = {
        "n_estimators": [100, 200],
        "max_depth": [5, 10],
        "min_samples_split": [2, 5]
    }

    grid = GridSearchCV(rf, param_grid, cv=3, scoring="accuracy", n_jobs=-1)
    grid.fit(X_train, y_train)

    # Evaluate
    y_pred = grid.predict(X_test)
    acc = accuracy_score(y_test, y_pred)
    print("Accuracy:", acc)
    print(classification_report(y_test, y_pred))

    # Log best model
    mlflow.sklearn.log_model(grid.best_estimator_, "best_model")

  return FileStore(store_uri, store_uri)
2025/11/23 15:59:17 INFO mlflow.sklearn.utils: Logging the 5 best runs, 3 runs will be omitted.


Accuracy: 0.8728813559322034
              precision    recall  f1-score   support

           0       0.87      0.99      0.93       667
           1       0.86      0.40      0.55       159

    accuracy                           0.87       826
   macro avg       0.87      0.69      0.74       826
weighted avg       0.87      0.87      0.85       826





In [20]:
#Save and push best model to Hugging Face Model Hub
import joblib
from huggingface_hub import create_repo, upload_file

# Save model to folder
model_path = "/content/tourism_project/model_building/best_model.pkl"
joblib.dump(grid.best_estimator_, model_path)

# Push to Hugging Face Model Hub
repo_id = "sgonuru/visit-with-us-model"
create_repo(repo_id, repo_type="model", exist_ok=True)

upload_file(
    path_or_fileobj=model_path,
    path_in_repo="best_model.pkl",
    repo_id=repo_id,
    repo_type="model"
)

Processing Files (0 / 0)      : |          |  0.00B /  0.00B            

New Data Upload               : |          |  0.00B /  0.00B            

  ...l_building/best_model.pkl:  21%|##        | 1.23MB / 5.88MB            

CommitInfo(commit_url='https://huggingface.co/sgonuru/visit-with-us-model/commit/7487f855277e94896661c6c79d105f93e464fd3a', commit_message='Upload best_model.pkl with huggingface_hub', commit_description='', oid='7487f855277e94896661c6c79d105f93e464fd3a', pr_url=None, repo_url=RepoUrl('https://huggingface.co/sgonuru/visit-with-us-model', endpoint='https://huggingface.co', repo_type='model', repo_id='sgonuru/visit-with-us-model'), pr_revision=None, pr_num=None)



*   The best-performing model from the hyperparameter tuning process was serialized using joblib and registered to the Hugging Face Model Hub. This helps with centralized access to the model artifact for deployment and future updates, supporting modular integration within the MLOps pipeline.


# Deployment

## Dockerfile

In [21]:
os.makedirs("tourism_project/deployment", exist_ok=True)

In [22]:
%%writefile tourism_project/deployment/Dockerfile
# Use a minimal base image with Python 3.9 installed
FROM python:3.9

# Set the working directory inside the container to /app
WORKDIR /app

# Copy all files from the current directory on the host to the container's /app directory
COPY . .

# Install Python dependencies listed in requirements.txt
RUN pip3 install -r requirements.txt

RUN useradd -m -u 1000 user
USER user
ENV HOME=/home/user \
	PATH=/home/user/.local/bin:$PATH

WORKDIR $HOME/app

COPY --chown=user . $HOME/app

# Define the command to run the Streamlit app on port "8501" and make it accessible externally
CMD ["streamlit", "run", "app.py", "--server.port=8501", "--server.address=0.0.0.0", "--server.enableXsrfProtection=false"]

Writing tourism_project/deployment/Dockerfile


## Streamlit App

In [23]:
import pandas as pd

# Load the original raw dataset
df_raw = pd.read_csv("/content/tourism_project/data/tourism.csv")

# List of categorical columns to encode
categorical_columns = [
    "TypeofContact",
    "Occupation",
    "Gender",
    "MaritalStatus",
    "Designation",
    "ProductPitched"
]

# Create mapping dictionaries
category_mappings = {}

for col in categorical_columns:
    cat_series = df_raw[col].astype("category")
    mapping = dict(enumerate(cat_series.cat.categories))
    reverse_mapping = {v: k for k, v in mapping.items()}
    category_mappings[col] = reverse_mapping

# Display mappings
for col, mapping in category_mappings.items():
    print(f"{col} Mapping:", mapping)

TypeofContact Mapping: {'Company Invited': 0, 'Self Enquiry': 1}
Occupation Mapping: {'Free Lancer': 0, 'Large Business': 1, 'Salaried': 2, 'Small Business': 3}
Gender Mapping: {'Fe Male': 0, 'Female': 1, 'Male': 2}
MaritalStatus Mapping: {'Divorced': 0, 'Married': 1, 'Single': 2, 'Unmarried': 3}
Designation Mapping: {'AVP': 0, 'Executive': 1, 'Manager': 2, 'Senior Manager': 3, 'VP': 4}
ProductPitched Mapping: {'Basic': 0, 'Deluxe': 1, 'King': 2, 'Standard': 3, 'Super Deluxe': 4}


In [24]:
%%writefile tourism_project/deployment/app.py
import streamlit as st
import pandas as pd
import joblib
from huggingface_hub import hf_hub_download

# Load model from Hugging Face Model Hub
model_path = hf_hub_download(repo_id="your-username/visit-with-us-model", filename="best_model.pkl")
model = joblib.load(model_path)

st.title("Wellness Tourism Package Purchase Predictor")

st.markdown("Enter customer details to predict the likelihood of purchase.")

# Input form
with st.form("input_form"):
    age = st.number_input("Age", min_value=18, max_value=100)
    typeof_contact = st.selectbox("Type of Contact", list(contact_map.keys()))
    city_tier = st.selectbox("City Tier", [1, 2, 3])
    occupation = st.selectbox("Occupation", list(occupation_map.keys()))
    gender = st.selectbox("Gender", list(gender_map.keys()))
    marital_status = st.selectbox("Marital Status", list(marital_map.keys()))
    designation = st.selectbox("Designation", list(designation_map.keys()))
    number_of_trips = st.slider("Number of Trips per Year", 0, 20, 1)
    passport = st.selectbox("Has Passport", [0, 1])
    own_car = st.selectbox("Owns Car", [0, 1])
    number_of_children = st.slider("Number of Children Visiting", 0, 5, 0)
    pitch_score = st.slider("Pitch Satisfaction Score", 1, 5, 3)
    product_pitched = st.selectbox("Product Pitched", list(product_map.keys()))
    followups = st.slider("Number of Follow-ups", 0, 10, 1)
    duration = st.slider("Duration of Pitch (minutes)", 0, 60, 10)

    submitted = st.form_submit_button("Predict")

    if submitted:
        input_df = pd.DataFrame([{
        "Age": age,
        "TypeofContact": contact_map[typeof_contact],
        "CityTier": city_tier,
        "Occupation": occupation_map[occupation],
        "Gender": gender_map[gender],
        "MaritalStatus": marital_map[marital_status],
        "Designation": designation_map[designation],
        "NumberOfTrips": number_of_trips,
        "Passport": passport,
        "OwnCar": own_car,
        "NumberOfChildrenVisiting": number_of_children,
        "PitchSatisfactionScore": pitch_score,
        "ProductPitched": product_map[product_pitched],
        "NumberOfFollowups": followups,
        "DurationOfPitch": duration
    }])
        prediction = model.predict(input_df)[0]
        st.success(f"Prediction: {'Will Purchase' if prediction == 1 else 'Will Not Purchase'}")

Writing tourism_project/deployment/app.py


## Dependency Handling

In [25]:
%%writefile tourism_project/deployment/requirements.txt
pandas
scikit-learn
streamlit
joblib
huggingface_hub

Writing tourism_project/deployment/requirements.txt


# Hosting

In [27]:
# Hosting script to push deployment files to Hugging Face Space
from huggingface_hub import HfApi

api = HfApi()
space_repo = "sgonuru/visit-with-us-space"

# Create space if not exists
api.create_repo(repo_id=space_repo, repo_type="space", space_sdk="docker", exist_ok=True)

# Upload deployment files
api.upload_file(path_or_fileobj="tourism_project/deployment/app.py", path_in_repo="app.py", repo_id=space_repo, repo_type="space")
api.upload_file(path_or_fileobj="tourism_project/deployment/requirements.txt", path_in_repo="requirements.txt", repo_id=space_repo, repo_type="space")
api.upload_file(path_or_fileobj="tourism_project/deployment/Dockerfile", path_in_repo="Dockerfile", repo_id=space_repo, repo_type="space")

CommitInfo(commit_url='https://huggingface.co/spaces/sgonuru/visit-with-us-space/commit/d3c0a31b324ec970cd9b31e4a8843c752ef67613', commit_message='Upload Dockerfile with huggingface_hub', commit_description='', oid='d3c0a31b324ec970cd9b31e4a8843c752ef67613', pr_url=None, repo_url=RepoUrl('https://huggingface.co/spaces/sgonuru/visit-with-us-space', endpoint='https://huggingface.co', repo_type='space', repo_id='sgonuru/visit-with-us-space'), pr_revision=None, pr_num=None)



*   Containerized the Streamlit-based prediction interface using a custom Dockerfile. Defined all runtime dependencies in a requirements.txt file and implemented a user-facing app.py that loads the trained model from the Hugging Face Model Hub and accepts structured customer inputs for real-time prediction. The deployment artifacts were successfully pushed to a Hugging Face Space using the "docker" SDK, enabling public hosting and interaction with the model through a web interface.




# MLOps Pipeline with Github Actions Workflow

**Note:**

1. Before running the file below, make sure to add the HF_TOKEN to your GitHub secrets to enable authentication between GitHub and Hugging Face.
2. The below code is for a sample YAML file that can be updated as required to meet the requirements of this project.

In [28]:
!apt-get install git

Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
git is already the newest version (1:2.34.1-1ubuntu1.15).
0 upgraded, 0 newly installed, 0 to remove and 41 not upgraded.


In [29]:
!git config --global user.email "shankar.gonuru@gmail.com"
!git config --global user.name "shankargonuru-creator"

In [30]:
!git clone https://github.com/shankargonuru-creator/tourism-mlops.git
%cd tourism-mlops

Cloning into 'tourism-mlops'...
remote: Enumerating objects: 3, done.[K
remote: Counting objects: 100% (3/3), done.[K
remote: Total 3 (delta 0), reused 0 (delta 0), pack-reused 0 (from 0)[K
Receiving objects: 100% (3/3), done.
/content/tourism-mlops


In [33]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [41]:
!cp /content/drive/My\ Drive/Colab\ Notebooks/Python Learning/Assignment9_AML_MLOps/Shankar_Gonuru_AML_and_MLOps_Project.ipynb .

cp: cannot stat '/content/drive/MyDrive/Colab': No such file or directory
cp: cannot stat 'Notebooks/Python': No such file or directory
cp: cannot stat 'Learning/Assignment9_AML_MLOps/Shankar_Gonuru_AML_and_MLOps_Project.ipynb': No such file or directory


```
name: Tourism Project Pipeline

on:
  push:
    branches:
      - main  # Automatically triggers on push to the main branch

jobs:

  register-dataset:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Install Dependencies
        run: <add_code_here>
      - name: Upload Dataset to Hugging Face Hub
        env:
          HF_TOKEN: ${{ secrets.HF_TOKEN }}
        run: <add_code_here>

  data-prep:
    needs: register-dataset
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Install Dependencies
        run: <add_code_here>
      - name: Run Data Preparation
        env:
          HF_TOKEN: ${{ secrets.HF_TOKEN }}
        run: <add_code_here>


  model-traning:
    needs: data-prep
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Install Dependencies
        run: <add_code_here>
      - name: Start MLflow Server
        run: |
          nohup mlflow ui --host 0.0.0.0 --port 5000 &  # Run MLflow UI in the background
          sleep 5  # Wait for a moment to let the server starts
      - name: Model Building
        env:
          HF_TOKEN: ${{ secrets.HF_TOKEN }}
        run: <add_code_here>


  deploy-hosting:
    runs-on: ubuntu-latest
    needs: [model-traning,data-prep,register-dataset]
    steps:
      - uses: actions/checkout@v3
      - name: Install Dependencies
        run: <add_code_here>
      - name: Push files to Frontend Hugging Face Space
        env:
          HF_TOKEN: ${{ secrets.HF_TOKEN }}
        run: <add_code_here>

```

**Note:** To use this YAML file for our use case, we need to

1. Go to the GitHub repository for the project
2. Create a folder named ***.github/workflows/***
3. In the above folder, create a file named ***pipeline.yml***
4. Copy and paste the above content for the YAML file into the ***pipeline.yml*** file

## Requirements file for the Github Actions Workflow

## Github Authentication and Push Files

* Before moving forward, we need to generate a secret token to push files directly from Colab to the GitHub repository.
* Please follow the below instructions to create the GitHub token:
    - Open your GitHub profile.
    - Click on ***Settings***.
    - Go to ***Developer Settings***.
    - Expand the ***Personal access tokens*** section and select ***Tokens (classic)***.
    - Click ***Generate new token***, then choose ***Generate new token (classic)***.
    - Add a note and select all required scopes.
    - Click ***Generate token***.
    - Copy the generated token and store it safely in a notepad.

In [None]:
# Install Git
!apt-get install git

# Set your Git identity (replace with your details)
!git config --global user.email "<-------GitHub Email Address------->"
!git config --global user.name "<--------GitHub UserName--------->"

# Clone your GitHub repository
!git clone https://github.com/<--------GitHub UserName--------->/<--------GitHub Reponame--------->.git

# Move your folder to the repository directory
!mv /content/tourism_project/ /content/<--------GitHub Reponame--------->

In [None]:
# Change directory to the cloned repository
%cd <--------GitHub Reponame--------->/

# Add the new folder to Git
!git add .

# Commit the changes
!git commit -m "first commit"

# Push to GitHub (you'll need your GitHub credentials; use a personal access token if 2FA enabled)
!git push https://<--------GitHub UserName--------->:<--------GitHub Token--------->@github.com/<--------GitHub UserName--------->/<--------GitHub Reponame--------->.git

# Output Evaluation

- GitHub (link to repository, screenshot of folder structure and executed workflow)

- Streamlit on Hugging Face (link to HF space, screenshot of Streamlit app)