# Study Plan generation using LangChain 🦜🔗
- Setup: install the dependencies listed in requirements.txt (for example, `pip install -r requirements.txt`).
- Environment variables: create a .env file and place your `OPENAI_API_KEY` inside it.
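
As a quick sanity check before running the cells below, a small helper along these lines can confirm that the key is visible to the process. This is only a sketch: `require_env` is a hypothetical helper (not part of any library), and it assumes the variable is named `OPENAI_API_KEY`, which is the name the notebook reads with `os.getenv`.

```python
import os


def require_env(name, environ=None):
    """Return the value of an environment variable or raise a clear error."""
    # `environ` is injectable so the helper can be tested without touching os.environ.
    environ = os.environ if environ is None else environ
    value = environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; add it to your .env file")
    return value
```

It could be called as `openai_key = require_env("OPENAI_API_KEY")` right after `load_dotenv()`.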

[View Notebook on GitHub](https://github.com/fisa712/cogentlabs/blob/main/UntitledCogent.ipynb)



Please note: my OpenAI API key has a limited quota, so I generated summaries for only a few pages of the book rather than the whole text.

In [1]:
from langchain.prompts import PromptTemplate
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnableParallel
from langchain_openai import ChatOpenAI
from dotenv import load_dotenv
import os


In [2]:

load_dotenv()

openai_key = os.getenv('OPENAI_API_KEY')


model = ChatOpenAI(api_key = openai_key, model='gpt-4')
template = """
You are an expert educational consultant specializing in creating personalized study plans for students. Your task is to generate a customized study plan for the student based on the provided information and the template. The study plan should address academic requirements, preferred learning styles, personal objectives, challenges, and extracurricular activities.

### Student Information:
  - Name: {name}
  - Field of Study: {field_of_study}
  - Year of Study: {year_of_study}
  - List of Subjects: {list_of_subjects}
  - Preferred Learning Styles: {preferred_learning_styles}
  - Personal Objectives: {personal_objectives}
  - Challenges: {challenges}
  - Extracurricular Activities: {extracurricular_activities}

### Study Plan Template:
  -> Week 1-4: Initial Phase
  - Academic Goals:
    <provide a list of goals to achieve in the coming month on the basis of previous academic record of student.>

  -> Weekly Schedule:
  - Monday to Friday:
    **Instruction**: To create a weekly schedule, analyze the student's data and list the courses that need improvement through more practice, review, and study.
    <time slot>: <course name: motivation why this course should be prioritized>
    <time slot>: <break> (if required between the courses for a healthy balance)
    ...
  - Weekends:
    **Instruction**: Analyze the data and decide how the student should spend time on extracurricular and kinesthetic activities, according to the student's details provided.
    <duration> hours of <activity>
    ...
  - Resources:
    **Instruction**: Provide a list of resources best suited to the student's abilities and previous performance.
    <resource name> : <type>, <discuss what it will cover>
    ...
  - Check-ins:
    **Instruction**: Provide a meeting schedule to meet the improvements and deficiencies.
    <meeting name> <meeting time slot>: <meeting reason, objectives>
    ...
  - Personal Goals:

    **Instruction**: List any exercise or hobby that would help the student create a work-life balance, based on the student's data.
    <technique/ exercise>: <motivation and its impact>

Remember to make the study plan as detailed and actionable as possible, catering to the unique needs and aspirations of the student. Moreover refrain from giving any side notes except the study plan.
"""

prompt = PromptTemplate(template=template, input_variables=[
    "name",
    "field_of_study",
    "year_of_study",
    "list_of_subjects",
    "preferred_learning_styles",
    "personal_objectives",
    "challenges",
    "extracurricular_activities"
])
# Example input
input_data = {
    "name": "Yousaf Hashmi",
    "field_of_study": "Computer Science",
    "year_of_study": "2nd Year",
    "list_of_subjects": "Algorithms, Data Structures, Operating Systems, Databases",
    "preferred_learning_styles": "visual, auditory",
    "personal_objectives": "prepare for upcoming exams, understand algorithms better",
    "challenges": "difficulty in understanding complex algorithms",
    "extracurricular_activities": "chess club, coding competitions"
}

chain =  prompt | model
result = chain.invoke(input_data)


In [2]:
print(result.content)

### Study Plan for Yousaf Hashmi

-> Week 1-4: Initial Phase
  - Academic Goals:
    1. Improve understanding and application of algorithms
    2. Prepare for upcoming exams in all subjects
    3. Improve competency in Data Structures, Operating Systems, and Databases

-> Weekly Schedule:
  - Monday to Friday:
    8:00 - 10:00: Algorithms - As Yousaf has mentioned difficulty in understanding complex algorithms, this subject should be prioritized.
    10:00 - 10:30: Break
    10:30 - 12:30: Data Structures - This is key to understanding algorithms better, hence it should be studied after algorithms.
    12:30 - 1:30: Break
    1:30 - 3:30: Operating Systems - This subject is fundamental to the field of Computer Science.
    3:30 - 4:00: Break
    4:00 - 6:00: Databases - This subject will complete knowledge base essential for coding competitions.

  - Weekends:
    2 hours of Chess Club - This will help Yousaf relax and also improve strategic thinking which is beneficial for algorithm u

# Task: Summary Generation of the book


- I designed a custom approach to create the summary document.
- With LangChain, the map_reduce chain breaks the document down into chunks of at most 1024 tokens.
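
The custom approach in the cells below groups pages into fixed-size batches and summarizes each batch separately. The grouping step can be sketched in plain Python. Here `group_pages` and the sample `pages` list are hypothetical stand-ins for the loaded documents, and the batch size of 20 mirrors the `step` value used later in the notebook.

```python
FOOTER = "Free eBooks at Planet eBook.com"  # footer text that appears in this PDF


def group_pages(pages, step=20, footer=FOOTER):
    """Join every `step` consecutive page strings into one section."""
    sections = []
    for start in range(0, len(pages), step):
        batch = pages[start:start + step]
        # str.replace removes the exact footer substring; str.strip would
        # instead treat its argument as a set of characters to trim.
        cleaned = [page.replace(footer, "").strip() for page in batch]
        sections.append("\n".join(cleaned))
    return sections


# Toy input: 45 fake pages, each ending with the footer.
pages = [f"page {i} {FOOTER}" for i in range(45)]
sections = group_pages(pages, step=20)
print(len(sections))  # 3 sections: 20 + 20 + 5 pages
```

Each resulting section would then be passed to the summarization prompt as one `book_section`.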

In [None]:
# %pip install --upgrade --quiet langchain-google-community[drive]

In [3]:
from langchain_google_community import GoogleDriveLoader

In [7]:
from langchain_community.document_loaders import PyPDFLoader

loader = PyPDFLoader("./books/crime-and-punishment.pdf")
docs = loader.load()

In [9]:
len(docs), type(docs)

(767, list)

In [21]:
from typing import Any
from langchain.prompts import PromptTemplate

from langchain_core.runnables import (
    ConfigurableField,
    Runnable,
    RunnableLambda,
    RunnableParallel,
    RunnablePassthrough,
)
step = 20

prompt_template = """Write a concise 1 page summary of the following:
"{book_section}"
ONE PAGE SUMMARY:"""
summ_prompt = PromptTemplate.from_template(prompt_template)


get_sections: Runnable[Any, Any] = (
    RunnablePassthrough() | 
    RunnableLambda(
        lambda docs: [
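            # str.strip(chars) trims a set of characters, not a substring,
            # so use replace() to drop the footer text from each page.
            {"book_section": "\n".join(doc.page_content.replace('Free eBooks at Planet eBook.com', '').strip() for doc in docs[i:i+step])}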
            for i in range(0, len(docs), step)
        ]
    )
)
search_query = summ_prompt | ChatOpenAI(model='gpt-4-0125-preview', temperature=0, openai_api_key = openai_key) #| StrOutputParser() | load_json

def add_paragraphs(text):
    para = text.content
    #  = text.get('summary')
    return para



summarize: Runnable[Any, Any] = (
    RunnableParallel(
        {
            "summary": summ_prompt | model #| RunnableLambda(add_paragraphs)
        }
    ) 
)
chain = get_sections | summarize.map()
summaries = chain.invoke(docs)
summaries  # a list of {'summary': AIMessage} dicts, one per section

[{'summary': AIMessage(content='"Crime and Punishment" by Fyodor Dostoevsky is a classic piece of literature available for free on Planet eBook. The book begins with an introduction to Dostoevsky, a celebrated Russian author who lived a life of poverty, hard labor, and suffering due to his involvement in revolutionary activities. His experiences profoundly impacted his work, including this novel. \n\nThe story starts with a young man named Raskolnikov, who lives in a decrepit garret in Petersburg. He is in debt and suffers from mental health issues, making him feel isolated from society. Despite his condition, he is not bothered by his shabby appearance, but he fears interacting with others, especially his landlady to whom he owes money.\n\nRaskolnikov contemplates a \'project\' he is planning, which he rehearses by visiting the flat of an old woman. The woman, who lives in a small, clean room with minimal furniture, is wary of him. Raskolnikov\'s project is left ambiguous, building su

In [26]:
summaries[0]['summary'].content

'"Crime and Punishment" by Fyodor Dostoevsky is a classic piece of literature available for free on Planet eBook. The book begins with an introduction to Dostoevsky, a celebrated Russian author who lived a life of poverty, hard labor, and suffering due to his involvement in revolutionary activities. His experiences profoundly impacted his work, including this novel. \n\nThe story starts with a young man named Raskolnikov, who lives in a decrepit garret in Petersburg. He is in debt and suffers from mental health issues, making him feel isolated from society. Despite his condition, he is not bothered by his shabby appearance, but he fears interacting with others, especially his landlady to whom he owes money.\n\nRaskolnikov contemplates a \'project\' he is planning, which he rehearses by visiting the flat of an old woman. The woman, who lives in a small, clean room with minimal furniture, is wary of him. Raskolnikov\'s project is left ambiguous, building suspense for the reader.\n\nThis 

In [29]:
from reportlab.lib.pagesizes import letter
from reportlab.pdfgen import canvas

class PDFCreator:
    def __init__(self, output_path):
        self.output_path = output_path
        self.paragraphs = []

    def add_paragraph(self, paragraph, last=False):
        self.paragraphs.append(paragraph)
        if last:
            self.create_pdf()

    def create_pdf(self):
        c = canvas.Canvas(self.output_path, pagesize=letter)
        width, height = letter
        text = c.beginText(40, height - 40)
        text.setFont("Helvetica", 12)

        for paragraph in self.paragraphs:
            text.textLines(paragraph)
            text.moveCursor(0, 14)  # Line spacing

        c.drawText(text)
        c.showPage()
        c.save()

if __name__ == '__main__':

    summaries = chain.invoke(docs[:10])
    os.makedirs("./summary_docs", exist_ok=True)  # make sure the output folder exists
    pdf_creator = PDFCreator("./summary_docs/output.pdf")
    # Add summaries to PDF
    for summary in summaries:
        pdf_creator.add_paragraph(summary['summary'].content)

    # Call the last add_paragraph with last=True to create the PDF
    pdf_creator.add_paragraph("", last=True)
    print('Summary generated successfully.')

Summary generated successfully.
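
One limitation of `PDFCreator` as written is that `textLines` breaks only at newline characters, so long summary lines can run past the right margin. A possible workaround, sketched here with the standard-library `textwrap` module, is to wrap each paragraph before adding it. The helper name `wrap_paragraph` and the default width of 90 characters are assumptions; a suitable width depends on the font and page size.

```python
import textwrap


def wrap_paragraph(paragraph, width=90):
    """Wrap each line of a paragraph to at most `width` characters."""
    wrapped = []
    for line in paragraph.splitlines():
        # textwrap.wrap returns [] for an empty line, so keep blank lines explicitly.
        wrapped.extend(textwrap.wrap(line, width=width) or [""])
    return "\n".join(wrapped)


sample = "word " * 40
print(wrap_paragraph(sample, width=50))
```

With this helper, the loop above could call `pdf_creator.add_paragraph(wrap_paragraph(summary['summary'].content))`. Text that exceeds one page would still need separate pagination handling.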


## Another approach: summarizing a book with LangChain's map_reduce chain

In [31]:
pdf_file = 'books/crime-and-punishment.pdf'
pdf_loader = PyPDFLoader(pdf_file)
pages = pdf_loader.load_and_split()
print(pages[3].page_content)


Crime and Punishment epilepsy, from which he suffered for the rest of his life. The 
fits occurred three or four times a year and were more fre -
quent in periods of great strain. In 1859 he was allowed to 
return to Russia. He started a journal— ‘Vremya,’ which was 
forbidden by the Censorship through a misunderstanding. 
In 1864 he lost his first wife and his brother Mihail. He was 
in terrible poverty, yet he took upon himself the payment 
of his brother’s debts. He started another journal—‘The Ep -
och,’ which within a few months was also prohibited. He 
was weighed down by debt, his brother’s family was depen -
dent on him, he was forced to write at heart-breaking speed, 
and is said never to have corrected his work. The later years 
of his life were much softened by the tenderness and devo -
tion of his second wife.
In June 1880 he made his famous speech at the unveil -
ing of the monument to Pushkin in Moscow and he was 
received with extraordinary demonstrations of love and 
h

#### MapReduce
The MapReduce method is a multi-stage approach for summarizing large texts. It works by first summarizing smaller chunks of text and then combining those summaries into a single comprehensive summary.

In LangChain, this method is implemented by `MapReduceDocumentsChain`. To use it, pass `chain_type="map_reduce"` to the `load_summarize_chain` function.
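
The two stages can be illustrated without any LLM. The following is a simplified sketch of the idea, not LangChain's implementation: `map_reduce_summarize`, `first_sentence`, and `join_lines` are hypothetical names, and the two toy functions stand in for the model calls made in the map and combine steps.

```python
def map_reduce_summarize(chunks, summarize, combine):
    """Summarize each chunk (map), then merge the partial summaries (reduce)."""
    partial_summaries = [summarize(chunk) for chunk in chunks]
    final_summary = combine("\n".join(partial_summaries))
    return final_summary, partial_summaries


# Toy stand-in for the map step: keep only the first sentence of a chunk.
def first_sentence(text):
    return text.split(".")[0].strip() + "."


# Toy stand-in for the combine step: merge the partial summaries into one line.
def join_lines(text):
    return " ".join(text.splitlines())


chunks = ["Alpha begins. More detail.", "Beta follows. Extra detail."]
final, steps = map_reduce_summarize(chunks, first_sentence, join_lines)
print(final)  # Alpha begins. Beta follows.
```

Returning the partial summaries alongside the final one plays a role similar to `return_intermediate_steps=True` in the cell below.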


In [42]:
from langchain.chains.summarize import load_summarize_chain
import pandas as pd
map_prompt_template = """
                      Write a summary of this chunk of text that includes the main points and any important details.
                      {text}
                      """

map_prompt = PromptTemplate(template=map_prompt_template, input_variables=["text"])

combine_prompt_template = """
                      Write a concise summary of the following text delimited by triple backticks.
                      Return your response in paragraphs that cover the important points of the text.
                      ```{text}```
                      SUMMARY:
                      """

combine_prompt = PromptTemplate(
    template=combine_prompt_template, input_variables=["text"]
)
map_reduce_chain = load_summarize_chain(
    model,
    chain_type="map_reduce",
    map_prompt=map_prompt,
    combine_prompt=combine_prompt,
    return_intermediate_steps=True,
)
# Calling the chain object directly is deprecated; use invoke() instead.
map_reduce_outputs = map_reduce_chain.invoke({"input_documents": pages[:30]})


In [43]:
summary = map_reduce_outputs['output_text']


In [44]:
if __name__ == '__main__':
    pdf_creator = PDFCreator("./summary_docs/map_reduce_output2.pdf")
    pdf_creator.add_paragraph(summary)

    # Call the last add_paragraph with last=True to create the PDF
    pdf_creator.add_paragraph("", last=True)
    print('Summary generated successfully.')

Summary generated successfully.
