# Project: Portfolio - Final Project

**Instructions for Students:**

Please carefully follow these steps to complete and submit your assignment:

1. **Completing the Assignment**: You are required to work on and complete all tasks in the provided assignment. Be disciplined and ensure that you thoroughly engage with each task.
   
2. **Creating a Google Drive Folder**: If you don't already have a folder for collecting assignments, create a new one in your Google Drive. This folder will hold all your completed assignment files, keeping your work organized and easy to access.
   
3. **Uploading Completed Assignment**: Upon completing your assignment, upload all necessary files, including code, reports, and related documents, to the Google Drive folder you created. Save the folder's link in the 'Student Identity' section and also pass it as the last parameter of the provided `submit` function.
   
4. **Sharing Folder Link**: You're required to share the link to your assignment Google Drive folder. This is crucial for the submission and evaluation of your assignment.
   
5. **Setting Permission to Public**: Please make sure your **Google Drive folder is set to public**. This allows your instructor to access your solutions and assess your work correctly.

Adhering to these procedures will facilitate a smooth assignment process for you and the reviewers.

**Description:**

Welcome to your final portfolio project assignment for AI Bootcamp. This is your chance to put all the skills and knowledge you've learned throughout the bootcamp into action by creating a real-world AI application.

You have the freedom to create any application or model, whether text-based, image-based, voice-based, or multimodal.

To get you started, here are some ideas:

1. **Sentiment Analysis Application:** Develop an application that can determine sentiment (positive, negative, neutral) from text data like reviews or social media posts. For your sentiment analysis model, you can use Natural Language Processing (NLP) libraries like NLTK or TextBlob, or more advanced pre-trained models from the transformers library by Hugging Face.

2. **Chatbot:** Design a chatbot serving a specific purpose such as customer service for a certain industry, a personal fitness coach, or a study helper. Tools like ChatterBot or Dialogflow can assist in designing conversational agents.

3. **Predictive Text Application:** Develop a model that suggests the next word or sentence similar to predictive text on smartphone keyboards. You could use the transformers library by Hugging Face, which includes pre-trained models like GPT-2.

4. **Image Classification Application:** Create a model to distinguish between different types of flowers or fruits. For this type of image classification task, pre-trained models like ResNet or VGG from PyTorch or TensorFlow can be utilized.

5. **News Article Classifier:** Develop a text classification model that categorizes news articles into predefined categories. NLTK, spaCy, and scikit-learn are valuable libraries for text pre-processing, feature extraction, and building classification models.

6. **Recommendation System:** Create a simplified recommendation system. For instance, a book or movie recommender based on user preferences. Python's Surprise library can assist in building effective recommendation systems.

7. **Plant Disease Detection:** Develop a model to identify diseases in plants using leaf images. This project requires a good understanding of convolutional neural networks (CNNs) and image processing. PyTorch, TensorFlow, and OpenCV are all great tools to use.

8. **Facial Expression Recognition:** Develop a model to classify human facial expressions. This involves complex feature extraction and classification algorithms. You might want to leverage deep learning libraries like TensorFlow or PyTorch, along with OpenCV for processing facial images.

9. **Chest X-Ray Interpretation:** Develop a model to detect abnormalities in chest X-ray images. This task may require an understanding of specific features in such images. Again, TensorFlow and PyTorch for deep learning, and libraries like scikit-image or PIL for image processing, could be of use.

10. **Food Classification:** Develop a model to classify a variety of foods such as local Indonesian food. Pre-trained models like ResNet or VGG from PyTorch or TensorFlow can be a good starting point.

11. **Traffic Sign Recognition:** Design a model to recognize different traffic signs. This project has real-world applicability in self-driving car technology. Once more, you might utilize PyTorch or TensorFlow for the deep learning aspect, and OpenCV for image processing tasks.

**Submission:**

Please upload both your model and application to Hugging Face or your own GitHub account for submission.

**Presentation:**

You are required to create a presentation to showcase your project, including the following details:

- The objective of your model.
- A comprehensive description of your model.
- The specific metrics used to measure your model's effectiveness.
- A brief overview of the dataset used, including its source, pre-processing steps, and any insights.
- An explanation of the methodology used in developing the model.
- A discussion of the challenges faced, how they were handled, and what you learned from them.
- Suggestions for potential future improvements to the model.
- A functioning link to a demo of your model in action.
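For the metrics item above, a classification project (sentiment analysis, image classification, and similar) would typically report accuracy, precision, recall, and F1. The following is a minimal sketch of how these are computed, not a required implementation; the function name `classification_metrics` and the sample labels are hypothetical and exist only for illustration.

```python
from collections import Counter


def classification_metrics(y_true, y_pred, positive):
    """Accuracy, precision, recall and F1, treating `positive` as the positive class."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    counts = Counter()
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            # Predicted positive: true positive if the label agrees, else false positive
            counts["tp" if truth == positive else "fp"] += 1
        elif truth == positive:
            # Predicted something else but the label was positive: false negative
            counts["fn"] += 1

    tp, fp, fn = counts["tp"], counts["fp"], counts["fn"]
    correct = sum(t == p for t, p in zip(y_true, y_pred))

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": correct / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical labels, for illustration only
y_true = ["pos", "neg", "pos", "neg", "pos", "neg"]
y_pred = ["pos", "neg", "neg", "neg", "pos", "pos"]
metrics = classification_metrics(y_true, y_pred, positive="pos")
print(metrics)
```

In practice, scikit-learn provides equivalent functions; the hand-written version is shown only to make the definitions explicit.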

**Grading:**

Submissions will be manually graded, with a select few given the opportunity to present their projects in front of a panel of judges. This will provide valuable feedback, further enhancing your project and expanding your knowledge base.

Remember, consistent practice is the key to mastering these concepts. Apply your knowledge, ask questions when in doubt, and above all, enjoy the process. Best of luck to you all!


In [1]:
# @title #### Student Identity
student_id = "REA6HXRRQ" # @param {type:"string"}
name = "Ratih Dewi Setyo Jati" # @param {type:"string"}
drive_link = "https://drive.google.com/drive/folders/1Vrjou0HK9_7a2qA6abCbjXJA6WRDqzBU?usp=sharing"  # @param {type:"string"}
assignment_id = "00_portfolio_project"

## Install and Import the `rggrader` Package

In [13]:
%pip install rggrader
from rggrader import submit_image
from rggrader import submit

!pip install -q --upgrade \
    gradio \
    langchain \
    langchain-google-genai \
    langchain-community \
    pandas \
    pypdf \
    pdfplumber \
    faiss-cpu

[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m2.5/2.5 MB[0m [31m28.0 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m50.9/50.9 kB[0m [31m2.8 MB/s[0m eta [36m0:00:00[0m
[?25h

## Working Space

In [22]:
import os
from glob import glob
import pandas as pd
import gradio as gr
from langchain_google_genai import ChatGoogleGenerativeAI, GoogleGenerativeAIEmbeddings
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_community.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import FAISS
from langchain_core.runnables import RunnablePassthrough
from langchain.docstore.document import Document
from google.colab import drive


# Mount Google Drive
drive.mount('/content/drive')

# 1. Function to collect document paths
def get_document_paths(folder_path):
    """Return the paths of all PDF and Excel files in the folder."""
    pdf_files = glob(os.path.join(folder_path, '*.pdf'))
    excel_files = glob(os.path.join(folder_path, '*.xlsx')) + glob(os.path.join(folder_path, '*.xls'))
    return pdf_files + excel_files

# 2. Function to load documents, split them, and build a retriever
def load_and_process_documents(file_paths):
    documents = []

    for file_path in file_paths:
        if file_path.endswith('.pdf'):
            try:
                loader = PyPDFLoader(file_path)
                pdf_docs = loader.load()
                for doc in pdf_docs:
                    doc.metadata["type"] = "pdf_document"
                documents.extend(pdf_docs)
                print(f"✓ Berhasil memproses PDF: {os.path.basename(file_path)}")
            except Exception as e:
                print(f"× Gagal memproses PDF {os.path.basename(file_path)}: {str(e)}")
        elif file_path.endswith(('.xlsx', '.xls')):
            try:
                xls = pd.ExcelFile(file_path)
                for sheet_name in xls.sheet_names:
                    df = pd.read_excel(file_path, sheet_name=sheet_name)
                    if not df.empty:
                        content = f"SHEET: {sheet_name}\n{df.to_markdown()}"
                        documents.append(Document(
                            page_content=content,
                            metadata={
                                "source": file_path,
                                "sheet": sheet_name,
                                "type": "excel"
                            }
                        ))
                print(f"✓ Berhasil memproses Excel: {os.path.basename(file_path)}")
            except Exception as e:
                print(f"× Gagal memproses Excel {os.path.basename(file_path)}: {str(e)}")

    if not documents:
        raise ValueError("Tidak ada dokumen valid yang bisa diproses")

    # Split documents into overlapping chunks
    text_splitter = RecursiveCharacterTextSplitter(
        chunk_size=1000,
        chunk_overlap=200
    )
    splits = text_splitter.split_documents(documents)

    # Build the vector store
    embeddings = GoogleGenerativeAIEmbeddings(model="models/embedding-001")
    vectorstore = FAISS.from_documents(splits, embeddings)
    return vectorstore.as_retriever(search_kwargs={"k": 3})

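# 3. Set up Gemini
# Read the API key from Colab's Secrets panel instead of hard-coding it in the notebook
# (assumes a secret named GOOGLE_API_KEY has been added there)
from google.colab import userdata
os.environ["GOOGLE_API_KEY"] = userdata.get("GOOGLE_API_KEY")

llm = ChatGoogleGenerativeAI(
    model="gemini-1.5-pro",  # Supported model at the time of writing
    temperature=0.3
)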

# 4. Prompt template
template = """ANALISIS JADWAL TIM:
Gunakan data berikut untuk menjawab pertanyaan:

{context}

Pertanyaan: {question}

Format jawaban:
- Tim: [nama tim]
- Hari: [hari]
- Detail: [informasi lengkap]
- Sumber: [file/sheet]"""
prompt = ChatPromptTemplate.from_template(template)
output_parser = StrOutputParser()

# 5. Process documents
folder_path = "/content/drive/My Drive/Final Project Bootcamp/chatbot_docs/"
file_paths = get_document_paths(folder_path)
retriever = load_and_process_documents(file_paths)

# 6. RAG Chain
def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | output_parser
)

# 7. Chat function
def respond(message, history):
    try:
        return rag_chain.invoke(message)
    except Exception as e:
        return f"Maaf, terjadi kesalahan. Silakan coba lagi. Error: {str(e)}"

# 8. Launch Gradio
demo = gr.ChatInterface(
    respond,
    title="⭐️Chatbot Jadwal Tim⭐️",
    description="Tanyakan tentang jadwal tim disini"
)
demo.launch(share=True)



Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).
✓ Berhasil memproses PDF: Pengumuman Libur Nasional dan Cuti Bersama 2025.pdf
✓ Berhasil memproses Excel: JADWAL HD PSS APRIL 2025.xlsx
✓ Berhasil memproses Excel: maping detail flightops.xlsx
✓ Berhasil memproses Excel: Jadwal Coverage + Onsite March 2025 - Tim Infraops System engineer-DBA.xlsx
✓ Berhasil memproses Excel: Jadwal Coverage ODE4.xlsx
✓ Berhasil memproses Excel: Jadwal Coverage tim Network OIO-2 Maret 2025.xlsx




Colab notebook detected. To show errors in colab notebook, set debug=True in launch()
* Running on public URL: https://46825cae93c4e41b60.gradio.live

This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from the terminal in the working directory to deploy to Hugging Face Spaces (https://huggingface.co/spaces)
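The chain above splits documents into overlapping chunks, retrieves the three chunks closest to the question, and joins them into the `{context}` slot of the prompt. The sketch below illustrates that flow with the standard library only, under loud simplifications: a fixed-size character window stands in for `RecursiveCharacterTextSplitter`, word-count vectors stand in for the Gemini embeddings, and a linear scan stands in for FAISS. The helper names `split_text`, `bag_of_words`, `cosine`, and `retrieve`, and the sample chunks, are hypothetical.

```python
import math
import re
from collections import Counter


def split_text(text, chunk_size=1000, chunk_overlap=200):
    """Fixed-size character windows; neighbouring windows share chunk_overlap characters."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be non-negative and smaller than chunk_size")
    if not text:
        return []
    step = chunk_size - chunk_overlap
    last_start = max(len(text) - chunk_overlap, 1)
    return [text[start:start + chunk_size] for start in range(0, last_start, step)]


def bag_of_words(text):
    """Toy stand-in for an embedding: lower-cased word counts."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query, chunks, k=3):
    """Return the k chunks most similar to the query (linear scan, no index)."""
    query_vec = bag_of_words(query)
    ranked = sorted(chunks, key=lambda chunk: cosine(query_vec, bag_of_words(chunk)), reverse=True)
    return ranked[:k]


def format_docs(chunks):
    """Join retrieved chunks the same way the notebook's format_docs joins page_content."""
    return "\n\n".join(chunks)


# Overlap demonstration on a short string
windows = split_text("abcdefghij", chunk_size=4, chunk_overlap=2)

# Hypothetical schedule snippets, for illustration only
chunks = [
    "Network team on-site coverage on Monday",
    "Database team on-call on Tuesday",
    "National holiday announcement for 2025",
    "Network team remote coverage on Friday",
]
top = retrieve("Who covers the network team on Monday?", chunks, k=2)
context = format_docs(top)
print(windows)
print(context)
```

The overlap exists so that a sentence cut at a chunk boundary still appears whole in at least one chunk; the real splitter additionally prefers to break at paragraph and line boundaries, which this sketch does not attempt.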




In [None]:
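# ✅ Path to the document folder in Google Drive
FOLDER_PATH = "/content/drive/My Drive/Final Project Bootcamp/chatbot_docs/"

# ✅ Gemini API key, read from the environment instead of being hard-coded
import google.generativeai as genai  # assumes the google-generativeai package is installed
genai.configure(api_key=os.environ["GOOGLE_API_KEY"])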


## Submit Notebook

In [None]:
portfolio_link = "https://github.com/arrdsj/DocuBot"
presentation_link = "https://docs.google.com/presentation/d/1NNsNIM0Q6AC_zIojQHBdWvnOkkOHs_77/edit?usp=sharing&ouid=110604046736752212553&rtpof=true&sd=true"

question_id = "01_portfolio_link"
submit(student_id, name, assignment_id, str(portfolio_link), question_id, drive_link)

question_id = "02_presentation_link"
submit(student_id, name, assignment_id, str(presentation_link), question_id, drive_link)

'Assignment successfully submitted'

# FIN