# Project: Portfolio - Final Project

**Instructions for Students:**

Please carefully follow these steps to complete and submit your assignment:

1. **Completing the Assignment**: You are required to work on and complete all tasks in the provided assignment. Be disciplined and ensure that you thoroughly engage with each task.
   
2. **Creating a Google Drive Folder**: If you don't already have a folder for collecting assignments, you must create a new folder in your Google Drive. This will be a repository for all your completed assignment files, helping you keep your work organized and easy to access.
   
3. **Uploading Completed Assignment**: Upon completion of your assignment, make sure to upload all necessary files, including code, reports, and related documents, into the created Google Drive folder. Save the folder link in the 'Student Identity' section and also provide it as the last parameter in the `submit` function that has been provided.
   
4. **Sharing Folder Link**: You're required to share the link to your assignment Google Drive folder. This is crucial for the submission and evaluation of your assignment.
   
5. **Setting Permission to Public**: Please make sure your **Google Drive folder is set to public**. This allows your instructor to access your solutions and assess your work correctly.

Adhering to these procedures will facilitate a smooth assignment process for you and the reviewers.

**Description:**

Welcome to your final portfolio project assignment for AI Bootcamp. This is your chance to put all the skills and knowledge you've learned throughout the bootcamp into action by creating a real-world AI application.

You have the freedom to create any application or model, be it text-based or image-based or even voice-based or multimodal.

To get you started, here are some ideas:

1. **Sentiment Analysis Application:** Develop an application that can determine sentiment (positive, negative, neutral) from text data like reviews or social media posts. You can use Natural Language Processing (NLP) libraries like NLTK or TextBlob, or more advanced pre-trained models from the transformers library by Hugging Face, for your sentiment analysis model.

2. **Chatbot:** Design a chatbot serving a specific purpose such as customer service for a certain industry, a personal fitness coach, or a study helper. Tools like the ChatterBot library or the Dialogflow platform can assist in designing conversational agents.

3. **Predictive Text Application:** Develop a model that suggests the next word or sentence similar to predictive text on smartphone keyboards. You could use the transformers library by Hugging Face, which includes pre-trained models like GPT-2.

4. **Image Classification Application:** Create a model to distinguish between different types of flowers or fruits. For this type of image classification task, pre-trained models like ResNet or VGG from PyTorch or TensorFlow can be utilized.

5. **News Article Classifier:** Develop a text classification model that categorizes news articles into predefined categories. NLTK, spaCy, and scikit-learn are valuable libraries for text pre-processing, feature extraction, and building classification models.

6. **Recommendation System:** Create a simplified recommendation system. For instance, a book or movie recommender based on user preferences. Python's Surprise library can assist in building effective recommendation systems.

7. **Plant Disease Detection:** Develop a model to identify diseases in plants using leaf images. This project requires a good understanding of convolutional neural networks (CNNs) and image processing. PyTorch, TensorFlow, and OpenCV are all great tools to use.

8. **Facial Expression Recognition:** Develop a model to classify human facial expressions. This involves complex feature extraction and classification algorithms. You might want to leverage deep learning libraries like TensorFlow or PyTorch, along with OpenCV for processing facial images.

9. **Chest X-Ray Interpretation:** Develop a model to detect abnormalities in chest X-ray images. This task may require understanding of specific features in such images. Again, TensorFlow and PyTorch for deep learning, and libraries like scikit-image or PIL for image processing, could be of use.

10. **Food Classification:** Develop a model to classify a variety of foods such as local Indonesian food. Pre-trained models like ResNet or VGG from PyTorch or TensorFlow can be a good starting point.

11. **Traffic Sign Recognition:** Design a model to recognize different traffic signs. This project has real-world applicability in self-driving car technology. Once more, you might utilize PyTorch or TensorFlow for the deep learning aspect, and OpenCV for image processing tasks.

**Submission:**

Please upload both your model and application to Hugging Face or your own GitHub account for submission.

**Presentation:**

You are required to create a presentation to showcase your project, including the following details:

- The objective of your model.
- A comprehensive description of your model.
- The specific metrics used to measure your model's effectiveness.
- A brief overview of the dataset used, including its source, pre-processing steps, and any insights.
- An explanation of the methodology used in developing the model.
- A discussion of the challenges faced, how they were handled, and what you learned from them.
- Suggestions for potential future improvements to the model.
- A functioning link to a demo of your model in action.

**Grading:**

Submissions will be manually graded, with a select few given the opportunity to present their projects in front of a panel of judges. This will provide valuable feedback, further enhancing your project and expanding your knowledge base.

Remember, consistent practice is the key to mastering these concepts. Apply your knowledge, ask questions when in doubt, and above all, enjoy the process. Best of luck to you all!


In [1]:
# @title #### Student Identity
student_id = "REA02Y3M" # @param {type:"string"}
name = "Rendy Kurnia" # @param {type:"string"}
drive_link = "https://drive.google.com/drive/folders/15GzLv_Z5sTE_2CzdksD77MRJ236PGEP_"  # @param {type:"string"}
assignment_id = "00_portfolio_project"

## Installation and Import `rggrader` Package

In [2]:
%pip install -q rggrader
from rggrader import submit_image
from rggrader import submit

## Working Space

In [None]:
# Write your code here
# Feel free to add new code block as needed

In [3]:
# Install packages
!pip install -q -U langchain tiktoken faiss-cpu chromadb sentence-transformers==2.2.2 InstructorEmbedding pypdf textract

In [4]:
import os
from langchain.document_loaders import TextLoader
import textract
from pypdf import PdfReader
from langchain import HuggingFaceHub
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import HuggingFaceInstructEmbeddings
from langchain.vectorstores import FAISS
from InstructorEmbedding import INSTRUCTOR
from langchain.chains import RetrievalQA, ConversationalRetrievalChain
from langchain.memory import ConversationBufferMemory
from google.colab import userdata
token = userdata.get("huggingface_write")


# Load documents

In [73]:
# Load pdf documents
documents = ""

reader = PdfReader('james-clear-transform-your-habits-v3.pdf')
for page in reader.pages:
    documents += page.extract_text()

In [64]:
documents[:300]

'\xa0\n \xa0\n\xa0\n \nTRANSFORM YOUR HABITS  \n3rd Edition \n \n \nNote from James Clear:  \n \nI wrote Transform Your Habits to create a free guide that would help people like \nyou make progress in health, business, and life. You are welcome to share it with \nanyone you think it would benefit. The latest version of T'
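
The preview above shows non-breaking spaces (`\xa0`) and runs of blank lines left over from PDF extraction. Below is a minimal sketch of one way to normalise such text before splitting. `clean_extracted_text` is a hypothetical helper, not part of pypdf or LangChain, and whether this cleanup improves retrieval for this corpus has not been tested here.

```python
import re


def clean_extracted_text(text: str) -> str:
    """Normalise whitespace in text extracted from a PDF (hypothetical helper)."""
    # Replace non-breaking spaces with ordinary spaces.
    text = text.replace("\xa0", " ")
    # Trim spaces and tabs on both sides of every line break.
    text = re.sub(r"[ \t]*\n[ \t]*", "\n", text)
    # Collapse three or more consecutive newlines into a single blank line.
    text = re.sub(r"\n{3,}", "\n\n", text)
    # Collapse repeated spaces or tabs inside a line.
    text = re.sub(r"[ \t]{2,}", " ", text)
    return text.strip()


# A short excerpt shaped like the raw preview above.
sample = "\xa0\n \xa0\n\xa0\n \nTRANSFORM YOUR HABITS  \n3rd Edition \n \n \nNote from James Clear:  "
print(clean_extracted_text(sample))
```

In the notebook this could be applied as `documents = clean_extracted_text(documents)` once all pages have been read.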

In [75]:
# Load txt documents
loader = TextLoader("/content/finance DS.txt")
pages = loader.load()
print(len(pages))
page = pages[0]

1


In [69]:
page.page_content[:300]

'What Does a Data Scientist\nin Finance Actually Do?\nBY WILL HILLIER, UPDATED ON AUGUST 31, 202311 mins read\nAs the backbone of the world’s economy, the finance sector has long understood the\nimportance of big data for making profitable decisions and taking calculated risks. That’s where\nfinancial dat'

In [76]:
# Combine 2 documents
documents = documents + '\n\n' + page.page_content

In [77]:
print(documents[:300])
print('')
print(documents[-300:])

 
  
 
 
TRANSFORM YOUR HABITS  
3rd Edition 
 
 
Note from James Clear:  
 
I wrote Transform Your Habits to create a free guide that would help people like 
you make progress in health, business, and life. You are welcome to share it with 
anyone you think it would benefit. The latest version of T

—complete with a job guarantee.  This month, we’re offering the first 100 students reduced tuition—worth up to $1,425 off—
on all of our career-change programs Book your application call and secure your spot
today!
Source: https://careerfoundry.com/en/blog/data-analytics/data-scientist-in-finance/
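
Concatenating both sources into one string discards the information about which file a passage came from. As a possible alternative, the sketch below keeps each text next to a `source` label. `collect_sources` is a hypothetical helper, and the sample strings merely stand in for the real PDF and article text.

```python
from typing import Dict, List, Tuple


def collect_sources(
    named_texts: List[Tuple[str, str]],
) -> Tuple[List[str], List[Dict[str, str]]]:
    """Return parallel lists of texts and metadata dicts, skipping empty texts."""
    texts: List[str] = []
    metadatas: List[Dict[str, str]] = []
    for source, text in named_texts:
        if text.strip():
            texts.append(text)
            metadatas.append({"source": source})
    return texts, metadatas


texts, metadatas = collect_sources([
    ("james-clear-transform-your-habits-v3.pdf", "Sample text standing in for the PDF."),
    ("finance DS.txt", "Sample text standing in for the article."),
])
print(metadatas)

# In the notebook, the two lists could then be handed to the splitter, e.g.
# splitter.create_documents(texts, metadatas=metadatas)
```

With per-chunk metadata, each retrieved `Document` would carry its origin, which may help when interpreting the source documents returned alongside an answer.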


In [79]:
# Document Splitting
chunk_size = 200
chunk_overlap = 10

splitter = RecursiveCharacterTextSplitter(
    chunk_size=chunk_size,
    chunk_overlap=chunk_overlap
)
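# Split the combined text into chunks of at most `chunk_size` characters,
# then wrap each chunk in a LangChain Document for the vector store.
textSplit = splitter.split_text(documents)
textSplit = splitter.create_documents(textSplit)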
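
To make the two parameters above concrete, here is a simplified fixed-window chunker. It is not the algorithm used by `RecursiveCharacterTextSplitter`, which prefers to break on separators such as paragraph and line breaks. It only illustrates how `chunk_size` caps the chunk length and how `chunk_overlap` repeats characters between neighbouring chunks. `sliding_chunks` is a hypothetical name.

```python
from typing import List


def sliding_chunks(text: str, chunk_size: int, chunk_overlap: int) -> List[str]:
    """Split text into fixed-size character windows that overlap their neighbours."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    # Each new window starts this many characters after the previous one.
    step = chunk_size - chunk_overlap
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window already reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


demo = sliding_chunks("abcdefghijklmnopqrstuvwxyz", chunk_size=10, chunk_overlap=3)
print(demo)
```

Each chunk after the first begins with the last three characters of the previous one, so a sentence cut at a boundary is still partly visible in the next chunk.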

# Embeddings

In [83]:
# Load embeddings instructor
instructor_embeddings = HuggingFaceInstructEmbeddings(
    model_name='hkunlp/instructor-xl', model_kwargs={'device':'cuda'}
)

# Implement embeddings
db = FAISS.from_documents(textSplit, instructor_embeddings)

# Save db
db.save_local("embeddings_DB")

load INSTRUCTOR_Transformer
max_seq_length  512
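
Conceptually, the vector store answers a query by embedding it and returning the stored chunks whose vectors lie closest to the query vector. The sketch below shows that idea with hand-made 3-dimensional vectors and Euclidean distance as one common choice of metric. The vectors, chunk labels, and `top_k` helper are invented for illustration; real INSTRUCTOR embeddings have far more dimensions.

```python
import math
from typing import List, Sequence, Tuple


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def top_k(
    query: Sequence[float],
    index: List[Tuple[str, Sequence[float]]],
    k: int = 2,
) -> List[Tuple[str, float]]:
    """Return the k (label, distance) pairs closest to the query vector."""
    scored = [(label, euclidean(query, vec)) for label, vec in index]
    return sorted(scored, key=lambda pair: pair[1])[:k]


# Toy "index": each entry pairs a chunk label with a made-up embedding.
toy_index = [
    ("habit chunk", [0.9, 0.1, 0.0]),
    ("finance chunk", [0.0, 0.2, 0.9]),
    ("reward chunk", [0.7, 0.3, 0.1]),
]
query_vector = [1.0, 0.0, 0.0]
print(top_k(query_vector, toy_index))
```

A brute-force scan like this is fine for a handful of vectors; a library such as FAISS exists to make the same lookup efficient for large collections.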


# Create Retrieval QA

In [None]:
# Load LLM
llm = HuggingFaceHub(
    repo_id="tiiuae/falcon-7b-instruct",
    model_kwargs={"temperature":1, "max_length":300},
    huggingfacehub_api_token=token
)

# Create the chatbot
qa = RetrievalQA.from_chain_type(llm=llm, chain_type="stuff", retriever=db.as_retriever(), return_source_documents=True)
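
The `"stuff"` chain type places all retrieved chunks into a single prompt ahead of the question. Below is a rough sketch of that assembly. The template wording approximates LangChain's default question-answering prompt and should be treated as an assumption, and `build_stuff_prompt` is a hypothetical helper. The trailing `Helpful Answer:` marker is the string that the question cells later split on.

```python
from typing import List

# Assumed template; the exact default wording may differ between LangChain versions.
TEMPLATE = (
    "Use the following pieces of context to answer the question at the end. "
    "If you don't know the answer, just say that you don't know.\n\n"
    "{context}\n\n"
    "Question: {question}\n"
    "Helpful Answer:"
)


def build_stuff_prompt(chunks: List[str], question: str) -> str:
    """Join every retrieved chunk into one context block and fill the template."""
    context = "\n\n".join(chunk.strip() for chunk in chunks)
    return TEMPLATE.format(context=context, question=question)


prompt = build_stuff_prompt(
    ["A keystone habit is a behavior or routine.", "Start small."],
    "What does keystone habit mean?",
)
print(prompt)
```

Because every chunk is pasted into one prompt, the approach only works while the retrieved chunks together fit in the model's context window, which is one reason to keep chunks small.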

In [106]:
# Ask question
question = 'how to build a strong habit?'
response = qa({"query": question})
answer = response.get("result").split('Helpful Answer:')[1]
explanation = response.get("source_documents", [])
print(answer)
explanation


To build a strong habit, you need to use a current habit as the reminder for your new one. This is because it’s much easier to build a new habit if you’re already doing something similar.

For example, if you’re trying to build a new habit of exercising, you can use a current habit like brushing your teeth as the reminder. This way, you’re more likely to stick to your new habit because it’s already part of your routine


[Document(page_content='to prove it to ourselves.  \n \n \nIdentity-Based Habits: How to Build Lasting \nHabits  \n \nThe key to building lasting habits is focusing on creating a new identity first. Your'),
 Document(page_content='some credit and enjoy each small success.  \n \nRelated note: Make sure that the habits you are trying to build are actually important to'),
 Document(page_content='by encoding your new behavior in something that you already do, rather than \nrelying on getting motivated.  \n \nFor example, I created a new habit of flossing each day by always doing it after'),
 Document(page_content='How can you use this structure to create new habits and actually stick to them?  \n \nJamesClear.com Page 9 \n \xa0\n \nHere’s how…  \n \nStep 1: Use a Current Habit as the Reminder for \nYour New One')]
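
The cell above takes index `[1]` after splitting on `'Helpful Answer:'`, which raises `IndexError` whenever the model's output lacks that marker. Below is a defensive variant, assuming the raw result is a plain string. `extract_answer` is a hypothetical helper.

```python
def extract_answer(result_text: str, marker: str = "Helpful Answer:") -> str:
    """Return the text after the last marker, or the whole text if the marker is absent."""
    # rpartition yields ('', '', text) when the marker does not occur.
    _, found, tail = result_text.rpartition(marker)
    return (tail if found else result_text).strip()


print(extract_answer("Question: q\nHelpful Answer: Start small."))
print(extract_answer("No marker here."))
```

In the notebook it could be used as `answer = extract_answer(response.get("result", ""))`.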

In [104]:
# Ask question
question = 'how to make myself feel happier?'
response = qa({"query": question})
answer = response.get("result").split('Helpful Answer:')[1]
explanation = response.get("source_documents", [])
print(answer)
explanation


1. Express gratitude: Take time each day to reflect on what you're thankful for.

2. Take care of your body: Eat healthy, exercise regularly, and get enough sleep.

3. Practice self-care: Do things that make you happy, such as reading a book or spending time with loved ones.

4. Make time for hobbies: Engage in activities that bring you joy and fulfillment.

5. Practice mindfulness: Focus on the present


[Document(page_content='For example, let’s say you want to feel happier. Expressing gratitude is one proven \nway to boost happiness. Using the list above, you could pick the reminder “sit'),
 Document(page_content='reward yourself: we want to continue doing things that make us feel good.  \n \n \nJamesClear.com Page 13 \n \xa0\nAnd that is why it’s especially important that you reward yourself each time you'),
 Document(page_content='★Floss one tooth. “Victory!”  \n★Eat a healthy meal. “Success!”  \n★Do five pushups. “Good work!”  \n \nRewarding yourself with positive self–talk can take some getting used to if you’re'),
 Document(page_content='If you want to start a new habit and begin living healthier and happier, then I have \none suggestion that I cannot emphasis enough: start small. In the words of Leo')]

In [115]:
# Ask question
question = 'What does keystone habit mean?'
response = qa({"query": question})
answer = response.get("result").split('Helpful Answer:')[1]
explanation = response.get("source_documents", [])
print(answer)
explanation

 Keystone habits are a set of daily activities that can help you achieve your long-term goals. They are the small things you do that can have a big impact on your life. For example, if you want to be more productive, you can use a keystone habit like setting a timer to work for 25 minutes and then taking a 5-minute break. This can help you stay focused and get more done.


[Document(page_content='A keystone habit is a behavior or routine that naturally pulls the rest of your \nlife in order.  \n \nI first heard about this idea in Charles Duhigg’s book, The Power of Habit.'),
 Document(page_content="Let's talk about what keystone habits are and how you can use them in your life.  \n \n \nThe Power of Keystone Habits"),
 Document(page_content='come a little bit easier. Exercise naturally pushes me towards my best self.  \n \n \nWhat Are Your Keystone Habits? \n \nImproving your lifestyle and becoming the type of person who “has their act'),
 Document(page_content="Well, I've got good news. Thanks to “keystone habits” you can actually focus on a \nsingle thing and improve your life in multiple areas at the same time.")]

In [116]:
# Ask question
question = 'Is it true that proving small wins is a way to sustain success?'
response = qa({"query": question})
answer = response.get("result").split('Helpful Answer:')[1]
explanation = response.get("source_documents", [])
print(answer)
explanation

 Yes, it is true. Proving small wins is a powerful way to sustain success. It helps you stay motivated and focused on your goals. By starting with small wins, you can build momentum and gradually increase the size of your successes. This can help you stay on track and maintain your progress over time.


[Document(page_content='2. Prove it to yourself with small wins.  \n \nNote:\u200b I cannot emphasize enough how important it is to start with incredibly small'),
 Document(page_content='Small win: Write one paragraph each day this week.  \n \nExample 3: Want to become strong?  \n \nIdentity: Become the type of person who never misses a workout.'),
 Document(page_content="success. But that's not what you need. You need better habits. \n \nIt’s so easy to overestimate the importance of one defining moment and"),
 Document(page_content='some credit and enjoy each small success.  \n \nRelated note: Make sure that the habits you are trying to build are actually important to')]

In [117]:
# Ask question
question = 'I want to start a new habit with a big step. Can it work?'
response = qa({"query": question})
answer = response.get("result").split('Helpful Answer:')[1]
explanation = response.get("source_documents", [])
print(answer)
explanation

 Yes, it can. Here's how.

1. Start small.
2. Make it easy.
3. Reward yourself.
4. Make it a habit.
5. Keep it up.

Here's an example:

1. Start small: Start with a small step that you can do easily.
2. Make it easy: Make it so that you can do it without much effort.
3. Reward yourself: Reward yourself for taking


[Document(page_content="new habit is the first step to making change easier.  \n \nThe reminder that you choose to initiate your new behavior is specific to your life \nand the habit that you're trying to create."),
 Document(page_content='How can you use this structure to create new habits and actually stick to them?  \n \nJamesClear.com Page 9 \n \xa0\n \nHere’s how…  \n \nStep 1: Use a Current Habit as the Reminder for \nYour New One'),
 Document(page_content='Taking the First Step to Breaking Bad Habits  \n \nIt’s easy to get caught up in how you feel about your bad habits. You can make'),
 Document(page_content='Your homework:\u200b Pick a new habit you want to start. Now ask yourself, “How can I \nmake this new behavior so easy to do that I can’t say no?”  \n \n \nStep 3: Always Reward Yourself')]

In [118]:
# Ask question
question = 'how to cook an egg?'
response = qa({"query": question})
answer = response.get("result").split('Helpful Answer:')[1]
explanation = response.get("source_documents", [])
print(answer)
explanation



To cook an egg, you need to bring water to a boil, then carefully lower the egg into the water. The egg will cook for about 5-7 minutes, depending on the size. Once the egg is cooked, you can remove it from the water and place it on a plate.


[Document(page_content='The Recipe for Sustained Success  \n \nChanging your beliefs isn’t nearly as hard as you might think. There are two steps.  \n \n1. Decide the type of person you want to be.'),
 Document(page_content='★Get in the shower.  \n★Put your shoes on.  \n★Brush your teeth.  \n★Flush the toilet.  \n★Sit down for dinner.  \n★Turn the lights off.  \n★Get into bed.'),
 Document(page_content='Step Two:\u200b You answer your phone (routine). This is the actual behavior. When your \nphone rings, you have a habit of answering it.'),
 Document(page_content='Over time, you can progressively move your bright line forward and add other \nbehaviors to the mix. (i.e. “I don’t eat red meat or fish.” And so on.)')]
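
The question above is unrelated to the indexed documents, yet the chain still returns an answer, presumably from the model's general knowledge, since none of the retrieved chunks mention cooking. One possible guard is to inspect the retrieval distance before calling the LLM. The sketch below works on `(text, distance)` pairs such as those a vector store's scored search returns (in LangChain, `similarity_search_with_score`). The threshold value, the example distances, and the `is_answerable` helper are hypothetical and would need tuning on real scores.

```python
from typing import List, Tuple


def is_answerable(
    scored_chunks: List[Tuple[str, float]],
    max_distance: float = 0.5,
) -> bool:
    """True if at least one chunk is within max_distance (smaller means more similar)."""
    return any(distance <= max_distance for _, distance in scored_chunks)


# Made-up distances for illustration only.
on_topic = [("Step 1: Use a Current Habit as the Reminder", 0.21), ("start small", 0.34)]
off_topic = [("Brush your teeth.", 0.83), ("Get into bed.", 0.91)]

for name, hits in [("on_topic", on_topic), ("off_topic", off_topic)]:
    verdict = "answer" if is_answerable(hits) else "decline: not covered by the documents"
    print(name, "->", verdict)
```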

In [121]:
# Ask question
question = 'What is a financial data scientist?'
response = qa({"query": question})
answer = response.get("result").split('Helpful Answer:')[1]
explanation = response.get("source_documents", [])
print(answer)
explanation

 A financial data scientist is an expert in analyzing and interpreting complex financial data to help businesses make better decisions. They use a variety of data analysis tools and techniques to uncover insights and trends in financial data. They are also responsible for developing predictive models and algorithms to help businesses anticipate future trends and make more informed decisions.


[Document(page_content='Financial data scientists are industry experts with specialist, in-depth domain knowledge. As one of the world’s most lucrative industries, the global finance sector was one of the first to identify'),
 Document(page_content='of a data scientist in the'),
 Document(page_content='In this post, we explored the ins-and-outs of data science within the finance industry. We learned that: \uf0b7 Financial data scientists work with the vast amounts of data available to financial'),
 Document(page_content='finance sector. But what exactly is a financial data scientist, and what does one do? In this post, we’ll answer all this')]

In [None]:
# Ask question
question = 'Who is the author?'
response = qa({"query": question})
answer = response.get("result").split('Helpful Answer:')[1]
explanation = response.get("source_documents", [])
print(answer)
explanation

 BJ Fogg


[Document(page_content='researchers. I originally learned of this cycle from Stanford professor, BJ Fogg. And \nmore recently, I read about it in Charles Duhigg’s best–selling book, The Power of \nHabit.'),
 Document(page_content='Do you see the difference?  \n \nI think the following quote from BJ Fogg, a professor at Stanford University, sums \nthis idea up nicely.'),
 Document(page_content='Example 2: Want to become a better writer?  \n \nJamesClear.com Page 19 \n \xa0\n \nIdentity: Become the type of person who writes 1,000 words every day.'),
 Document(page_content="(Duhigg’s book refers to the three steps as cue, routine, reward. Regardless of how \nit's phrased, the point is that there is a lot of science behind the process of habit")]

In [None]:
# Ask question
question = 'What is the book about?'
response = qa({"query": question})
answer = response.get("result").split('Helpful Answer:')[1]
explanation = response.get("source_documents", [])
print(answer)
explanation

 The book is about the science of habit formation and how to transform your habits to achieve your goals.


[Document(page_content='day is a new type of lifestyle.  \n★Publishing your first book would be life–changing, emailing a new book \nagent each day is a new type of lifestyle.'),
 Document(page_content="(Duhigg’s book refers to the three steps as cue, routine, reward. Regardless of how \nit's phrased, the point is that there is a lot of science behind the process of habit"),
 Document(page_content='A keystone habit is a behavior or routine that naturally pulls the rest of your \nlife in order.  \n \nI first heard about this idea in Charles Duhigg’s book, The Power of Habit.'),
 Document(page_content='TRANSFORM YOUR HABITS  \n3rd Edition \n \n \nNote from James Clear:  \n \nI wrote Transform Your Habits to create a free guide that would help people like')]

In [None]:
# References:
# - https://github.com/abidsaudagar/Private-Chatbot
# - Langchain project from Skill Academy Pro Bootcamp
# - https://medium.com/@abdullahw72/langchain-chatbot-for-multiple-pdfs-harnessing-gpt-and-free-huggingface-llm-alternatives-9a106c239975

## Submit Notebook

In [None]:
portfolio_link = ""
presentation_link = ""

question_id = "01_portfolio_link"
submit(student_id, name, assignment_id, str(portfolio_link), question_id, drive_link)

question_id = "02_presentation_link"
submit(student_id, name, assignment_id, str(presentation_link), question_id, drive_link)

# FIN