# RAG Application for Company Analysis

## Context
I am building a Retrieval-Augmented Generation (RAG) application to analyze a company in the Indian market, referred to here as XYZ. The aim is to understand the company better and make informed judgments about its investability. The application is built around an in-depth analysis document of the company, which it queries to answer questions and support decision-making.

## Objectives
1. **In-depth Company Analysis**: The application provides detailed analysis of the company, covering various aspects such as financial performance, market position, growth prospects, and risks.
2. **Enhanced Understanding**: Users can interact with the RAG application to extract specific information and gain a deeper understanding of the company's performance and potential.
3. **Investment Decision Support**: The application assists users in making informed investment decisions by presenting relevant data and insights from the analysis document.

## Tech Stack

### LangChain Framework
The LangChain framework builds and manages the language model pipeline. It connects the document loader, text splitter, vector store, prompt, and model into a single chain that handles both retrieval and text generation.

### OpenAI Credentials
OpenAI's API provides the embeddings and the text generation. The API key is read from the `OPENAI_API_KEY` environment variable, which is loaded from a `.env` file, so the key never appears in the notebook itself.

### Vector Store
A vector store holds the embeddings of the analysis document's chunks. At query time, the chunks most similar to the user's question are retrieved and passed to the model as context. This notebook uses the in-memory `DocArrayInMemorySearch` store.
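
As a rough illustration of what a vector store does, the sketch below embeds a few chunks and ranks them by cosine similarity to a query. The bag-of-words `embed` function and the `TinyVectorStore` class are hypothetical stand-ins for illustration only; the application itself relies on OpenAI embeddings, which capture meaning rather than word counts.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (an assumption, not OpenAI's model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class TinyVectorStore:
    """Hypothetical in-memory store: keeps (chunk, vector) pairs in a list."""

    def __init__(self, chunks: list[str]) -> None:
        self.entries = [(chunk, embed(chunk)) for chunk in chunks]

    def search(self, query: str, k: int = 1) -> list[str]:
        """Return the k chunks most similar to the query."""
        query_vec = embed(query)
        ranked = sorted(
            self.entries, key=lambda entry: cosine(query_vec, entry[1]), reverse=True
        )
        return [chunk for chunk, _ in ranked[:k]]


store = TinyVectorStore([
    "India's peak power demand hit a record 250 GW",
    "The solar energy sector in India is booming",
    "Per capita power consumption is relatively low",
])
top = store.search("solar energy sector growth", k=1)
```

Only the second chunk shares words with the query, so it ranks first. A real embedding model would also match paraphrases that share no words at all.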

## Features
- **Interactive Q&A**: Users can ask questions related to the company's analysis, and the application provides precise answers by retrieving relevant information from the document.
- **Detailed Insights**: The application presents comprehensive insights into various aspects of the company, helping users understand its strengths, weaknesses, opportunities, and threats.
- **Investment Analysis**: By analyzing key metrics and indicators, the application aids in making informed investment decisions regarding the company.
- **User-friendly Interface**: The application is designed to be intuitive and user-friendly, allowing users to interact seamlessly and obtain the required information with ease.

## Benefits
- **Enhanced Decision-Making**: The RAG application provides valuable insights and data, supporting users in making well-informed investment decisions.
- **Time Efficiency**: By automating the retrieval and generation of relevant information, the application saves users' time and effort in analyzing the company.
- **Grounded Analysis**: Answers are generated from chunks retrieved from the analysis document, which keeps them relevant to the source material rather than to the model's general knowledge.


### Load Dependencies and the LLM

In [16]:
# Load the required libraries
import os

from dotenv import load_dotenv
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_openai import ChatOpenAI, OpenAIEmbeddings

# Read OPENAI_API_KEY from a local .env file into the environment;
# the OpenAI clients below pick it up from there.
load_dotenv()
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")

# Load the embedding model and the chat model
embeddings_model = OpenAIEmbeddings()
model = ChatOpenAI(model="gpt-3.5-turbo", max_tokens=200)
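
If the key is missing, the problem otherwise surfaces only when the OpenAI client is created or first called. A minimal sketch of a fail-fast check follows; `require_env` is a hypothetical helper, not part of LangChain or python-dotenv.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise if it is unset or empty."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(
            f"{name} is not set; add it to your .env file or shell environment"
        )
    return value
```

Calling `require_env("OPENAI_API_KEY")` right after `load_dotenv()` would then stop the notebook with a clear message before any model is created.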

### Load the Analysis Document and Split the Text

In [3]:
with open("XYZ_company_analysis.txt") as file:
    transcription = file.read()

transcription[:100]

" Part 1: Industry; India's Renewable Power Journey\nIndia's Electricity Challenge: Mee ng Growing Dem"

In [7]:
from langchain_community.document_loaders import TextLoader

loader = TextLoader("XYZ_company_analysis.txt")
text_documents = loader.load()
text_documents

[Document(page_content=" Part 1: Industry; India's Renewable Power Journey\nIndia's Electricity Challenge: Mee ng Growing Demand\nDid you know that India is the world's third-largest electricity producer? In FY22, the country\nconsumed a whopping 1.7-petawa hours (PWh) (1,700 BU) of electricity! Most of this (86%) comes\nfrom u lity genera on, totalling 1.48 PWh. Despite this, India's per capita power consump on is s ll\nrela vely low at 1255 kWh. This is partly because the power industry's growth hasn't quite kept up\nwith the rapid increase in demand.\nHistoric Peak Power Demand\nGuess what? In May 2024, India's peak power demand hit a record 250 GW! Looking ahead, India's\npower demand is expected to grow by 70% by 2032 due to rising urbaniza on and increased demand\nfrom various sectors.\nPower Sector: Renewable vs Non-Renewable\nThe solar energy sector in India is booming! Rising demand for roo op installa ons and the use of\nsolar power in cap ve setups is driving this growth. It

In [9]:
from langchain.text_splitter import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(chunk_size=100, chunk_overlap=20)
text_splitter.split_documents(text_documents)[:5]

[Document(page_content="Part 1: Industry; India's Renewable Power Journey", metadata={'source': 'XYZ_company_analysis.txt'}),
 Document(page_content="India's Electricity Challenge: Mee ng Growing Demand", metadata={'source': 'XYZ_company_analysis.txt'}),
 Document(page_content="Did you know that India is the world's third-largest electricity producer? In FY22, the country", metadata={'source': 'XYZ_company_analysis.txt'}),
 Document(page_content='consumed a whopping 1.7-petawa hours (PWh) (1,700 BU) of electricity! Most of this (86%) comes', metadata={'source': 'XYZ_company_analysis.txt'}),
 Document(page_content="from u lity genera on, totalling 1.48 PWh. Despite this, India's per capita power consump on is s", metadata={'source': 'XYZ_company_analysis.txt'})]
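
To make `chunk_size` and `chunk_overlap` concrete, here is a simplified sliding-window splitter. It is an illustrative assumption, not LangChain's algorithm: `RecursiveCharacterTextSplitter` additionally prefers to break on separators such as newlines and spaces, which is why most of the chunks above coincide with lines of the source file.

```python
def sliding_window_split(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Cut text into windows of chunk_size characters sharing chunk_overlap characters."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
        start += step
    return chunks


chunks = sliding_window_split("abcdefghijklmnopqrstuvwxyz", chunk_size=10, chunk_overlap=4)
```

Each window starts `chunk_size - chunk_overlap` characters after the previous one, so neighbouring chunks repeat the last `chunk_overlap` characters. The overlap reduces the chance that a sentence relevant to a query is cut in half at a chunk boundary.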

### Store the Splits in VectorStore

In [11]:
text_splitter = RecursiveCharacterTextSplitter(chunk_size=100, chunk_overlap=20)
documents = text_splitter.split_documents(text_documents)

In [12]:
from langchain_community.vectorstores import DocArrayInMemorySearch
from langchain_openai.embeddings import OpenAIEmbeddings

embeddings = OpenAIEmbeddings()

comp_analysis_vector_store = DocArrayInMemorySearch.from_documents(documents, embeddings)

### Build the LangChain Pipeline

In [15]:
from langchain_core.runnables import RunnableParallel, RunnablePassthrough
from langchain_core.output_parsers import StrOutputParser

parser = StrOutputParser()

In [14]:
from langchain.prompts import ChatPromptTemplate

template = """
Answer the question based on the context below, using the simplest language possible. Structure the answer under the following headers:
    - Summarize the Analysis and give 4-5 major bullet points
    - Describe the Industry of the company
    - Fundamental and Technical summary about the company
If you can't answer the question, reply "I don't know".

Context: {context}

Question: {question}
"""

prompt = ChatPromptTemplate.from_template(template)
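
Conceptually, the template is filled by substituting the retrieved chunks for `{context}` and the user's query for `{question}`. The sketch below mimics this with plain `str.format`; `build_prompt` and the shortened `SHORT_TEMPLATE` are hypothetical, and `ChatPromptTemplate` additionally wraps the filled text in a chat message.

```python
SHORT_TEMPLATE = """Answer the question based on the context below.
If you can't answer the question, reply "I don't know".

Context: {context}

Question: {question}
"""


def build_prompt(chunks: list[str], question: str) -> str:
    """Join retrieved chunks into one context block and fill the template."""
    context = "\n".join(chunks)
    return SHORT_TEMPLATE.format(context=context, question=question)


filled = build_prompt(
    ["India's peak power demand hit a record 250 GW."],
    "What was the record peak power demand?",
)
```

Printing `filled` is a quick way to inspect exactly what text the model would receive for a given set of chunks.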

In [18]:
chain = (
    {"context": comp_analysis_vector_store.as_retriever(), "question": RunnablePassthrough()}
    | prompt
    | model
    | parser
)
chain.invoke("Summarize the company for me, provide the output in readable and markdown format")

"### Summarize the Analysis\n- XYZ company has transitioned from being a top solar module manufacturer to a diverse renewable energy company.\n- The analysis of the company suggests growth and expansion into various sectors within the renewable energy industry.\n- The company's journey showcases its evolution and adaptability to changing market demands.\n- XYZ company's strategic decisions have positioned it as a significant player in the renewable energy sector.\n\n### Describe the Industry of the Company\nThe company operates in the renewable energy industry, with a focus on solar energy and other sustainable energy sources. With a transition from being primarily a solar module manufacturer, XYZ now offers a comprehensive range of renewable energy solutions.\n\n### Fundamental and Technical Summary about the Company\nXYZ company's fundamentals indicate a strong position in the renewable energy market, backed by strategic decisions and growth initiatives. The technical analysis suggests a positive outlook for the company's stock performance, reflecting its growth trajectory and market positioning."
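
The `|` operator composes steps so that each step's output feeds the next. The sketch below imitates that idea in plain Python; the `Step` class and the fake retriever and model are hypothetical stand-ins, not LangChain's `Runnable` implementation.

```python
from typing import Any, Callable


class Step:
    """Hypothetical wrapper that lets plain functions be chained with `|`."""

    def __init__(self, func: Callable[[Any], Any]) -> None:
        self.func = func

    def __or__(self, other: "Step") -> "Step":
        # Run this step first, then pass its result to the next one.
        return Step(lambda value: other.func(self.func(value)))

    def invoke(self, value: Any) -> Any:
        return self.func(value)


# Stand-ins for the retriever, prompt, model, and parser used in the real chain.
gather = Step(lambda q: {"context": f"[chunks matching: {q}]", "question": q})
fill_prompt = Step(lambda d: f"Context: {d['context']}\nQuestion: {d['question']}")
fake_model = Step(lambda p: {"content": f"ANSWER based on -> {p}"})
parse = Step(lambda message: message["content"])

toy_chain = gather | fill_prompt | fake_model | parse
result = toy_chain.invoke("Summarize the company")
```

The first step plays the role of the dictionary in the real chain: it turns the raw question into a `context` and `question` pair, which the later steps consume in order.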