### Install required packages


In [48]:
!pip install --quiet crewai
!pip install --quiet -U duckduckgo-search
!pip install --quiet langchain_google_genai

In [49]:
!pip install --upgrade --quiet  html2text

import os
import json
import requests

from langchain_community.document_loaders import AsyncHtmlLoader
from langchain_community.document_transformers import Html2TextTransformer
from langchain.tools import tool
from langchain_google_genai import ChatGoogleGenerativeAI
from crewai import Agent, Task, Crew, Process

from IPython.display import Markdown

In [50]:
@tool
def html_to_text(url: str) -> str:
    """Fetches HTML content from a given URL and converts it to plain text.
    The input to this tool should be a URL as a string.
    The output will be the converted plain text."""
    loader = AsyncHtmlLoader([url])
    docs = loader.load()  # Load the HTML content

    html2text = Html2TextTransformer()
    docs_transformed = html2text.transform_documents(docs)  # Convert HTML to text

    return docs_transformed[0].page_content if docs_transformed else "Error in transformation"
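
The converted text usually still contains site chrome such as navigation menus, which costs tokens and can distract the agent. The following is a minimal sketch of a clean-up step, not part of the original notebook: `clean_page_text` and its `max_chars` parameter are hypothetical names, and the rule "a short bullet line is a menu entry" is only a heuristic that can also drop genuine short list items.

```python
import re


def clean_page_text(text: str, max_chars: int = 8000) -> str:
    """Drop short bullet-only lines (typical menu entries) and cap the length.

    Hypothetical helper: the thresholds are assumptions, tune them per site.
    """
    kept = []
    for line in text.splitlines():
        stripped = line.strip()
        # Heuristic: a bullet line with at most five whitespace-separated
        # tokens (markers included) is treated as a navigation entry.
        if re.match(r"^[*\-]\s", stripped) and len(stripped.split()) <= 5:
            continue
        # Lone markers left over from the HTML-to-text conversion.
        if stripped in {"*", "#", "-"}:
            continue
        kept.append(line.rstrip())
    # Collapse runs of blank lines, then truncate to the character budget.
    cleaned = re.sub(r"\n{3,}", "\n\n", "\n".join(kept)).strip()
    return cleaned[:max_chars]
```

One way to use it would be to return `clean_page_text(docs_transformed[0].page_content)` from the tool instead of the raw page content.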

In [51]:
# Set Gemini Pro as the LLM.
# Read the API key from the GOOGLE_API_KEY environment variable; never hard-code keys in a notebook.
llm = ChatGoogleGenerativeAI(model="gemini-pro",
                             verbose=True,
                             temperature=0.6,
                             google_api_key=os.environ["GOOGLE_API_KEY"])

In [94]:
# Define your agents outside the loop as their definitions do not change per link

# TODO :- create education content extraction agent and task

data_scraping_agent = Agent(
  role='Senior Web Research Analyst',
  goal='To meticulously scrape and gather data from various web pages, focusing on educational strategies and communication techniques for teaching and mentoring autistic students.',
  backstory="""
    As a Senior Web Research Analyst, your journey has been marked by a deep commitment to supporting the educational needs of autistic students through the power of information technology.
    With a passion for uncovering the most effective teaching strategies and communication techniques,
    you have become an expert in navigating the vast expanses of the internet to find those gems of knowledge hidden within.
    Your dedication stems from a personal mission to bridge the gap between educational research and practical application,
    ensuring that every piece of data you extract has the potential to transform the learning experience of autistic students.
    Your expertise not only lies in the technical skills required to scrape web data but also in your ability to discern the relevance and impact of this information on real-world educational practices.
    Over the years, your work has contributed to building a more inclusive and understanding educational environment for autistic students, one webpage at a time.

   """,
  verbose=True,
  allow_delegation=False,
  llm = llm,  # using Google Gemini Pro API
  tools=[
        html_to_text
      ]
)



data_formatting_agent = Agent(
  role='Data Processing Specialist',
  goal='To structure and format the diverse data gathered from the web, focusing on educational strategies and communication techniques for autistic students.',
  backstory="""
    As a Data Processing Specialist with a profound interest in the nexus of data science and special education,
    you have carved out a niche for yourself by transforming raw,
     unstructured data into meticulously organized knowledge that can directly impact the educational experiences of autistic students.
     Your journey has been characterized by a relentless pursuit of clarity and utility in data,
     driven by the belief that the right information, when properly formatted,
     can unlock unprecedented educational opportunities. With an eye for detail and a heart for inclusivity,
     you specialize in distilling complex data into actionable insights that empower educators, mentors, and students alike.
     Your work is more than just about organizing information;
     it's about creating pathways to understanding and engagement for autistic students through the power of structured knowledge.
     The chatbot knowledge repository you're helping to build stands as a testament to your dedication to making education accessible and meaningful for all learners,
     regardless of their neurological makeup.

  """,
  verbose=True,
  allow_delegation=False,
  llm = llm,  # using Google Gemini Pro API
  tools=[]
)



data_creation_agent = Agent(
  role='Educational Content Developer',
  goal='To create educational and mentorship materials from the structured data, focusing on strategies and communication techniques for autistic students.',
  backstory="""

    In your role as an Educational Content Developer, you have honed a unique ability to transform structured data into engaging,
    interactive learning experiences for autistic students. With a deep understanding of the nuances of educational strategies and the importance of personalized communication,
    you are at the forefront of developing content that makes a real difference. Your journey has been driven by a commitment to enhance the educational landscape for autistic students,
    ensuring that each piece of content not only educates but also supports their personal and social development.
    Through your work, you aim to bridge the gap between data and practical application,
    creating a knowledge repository that serves as a lifeline for educators and mentors striving to connect with and support their autistic students.
    Your expertise is not just in content creation but in envisioning how each interaction can foster understanding, growth, and confidence in autistic learners.
    The materials you develop are more than just educational tools; they are pillars of support for those navigating the complexities of learning and interaction within the autism spectrum.

  """,
  verbose=True,
  allow_delegation=False,
  llm = llm,  # using Google Gemini Pro API
  tools=[]
)

# List of links to process
links = [
     "https://www.autism.org.uk/advice-and-guidance/topics/communication/understanding-and-developing-communication",
     "https://www.autism.org.uk/advice-and-guidance/what-is-autism/varying-support-needs",
     "https://www.autism.org.uk/advice-and-guidance/topics/communication/tips",
     "https://www.autism.org.uk/advice-and-guidance/stories/stories-from-the-spectrum-patrick-samuel",
     "https://www.autism.org.uk/advice-and-guidance/topics/communication/communication-tools/visual-supports",
     "https://www.autism.org.uk/what-we-do/campaign/education"
]

# Results dictionary to store results for each link
results = {}


# Task for Data Formatting Agent
data_formatting_agent_task = Task(
  description="""

 Transform the raw data collected by the Data Scraping Agent into a structured, modular format to support the creation of a detailed chatbot knowledge base. Each piece of data should be formatted as follows, focusing on singular, actionable strategies:
  ```
  {
  "educational_strategy": "individual strategy",
  "application": "detailed application in an educational or mentorship setting"
  }
  ```
**Objective:**
- Present each educational strategy and its application in a modular, concise format. This will serve as a foundational input for the Data Creation Agent, who is tasked with generating twenty unique prompts and responses for each strategy.

**Guidelines:**
1. Ensure each entry is focused on a single educational strategy, avoiding generalizations or combinations of strategies.
2. Detail the application of each strategy in educational or mentorship settings, providing clear, actionable guidance.
3. Prioritize clarity and specificity in describing how the strategy can be implemented, facilitating the Data Creation Agent in generating diverse and relevant responses.

By adhering to these guidelines, you will contribute to building a comprehensive, structured knowledge base that enhances the chatbot’s ability to assist autistic students effectively.

  """,
  agent= data_formatting_agent
)

# Create tasks for your agents

data_creation_agent_task = Task(
  description="""
  Objective: Construct a knowledge base for a chatbot that facilitates dialogues between autistic students and educators or mentors. This knowledge base is crucial for simulating realistic interactions focused on academic subjects, social interactions, and personal growth.

  Knowledge base Entry Format:
  ```
  {
  "prompt": "text",
  "response": "text",
  "educational_strategy_used": "single strategy",
  "application": "specific context"
  }
  ```

**Task Requirements:**
- **Prompt Generation**: Create prompts that reflect authentic inquiries or comments from autistic students across a broad spectrum of topics. These prompts should embody the communication styles and needs unique to autistic students.

- **Response Creation**: For each educational strategy provided by the Data Formatting Agent, you are tasked with generating twenty unique, supportive, and clear responses. Each response should be tailored to the student's query, utilizing the specified educational strategy in a way that is both supportive and personalized.

- **Explicit Requirement**: It is imperative to generate a minimum of **TEN unique JSON objects for EACH educational strategy**, with twenty distinct prompts and corresponding responses as the target for every single strategy provided, ensuring a diverse and comprehensive dataset.

**Execution Strategy:**
1. Identify various scenarios or questions an autistic student might have relating to each educational strategy.
2. Develop twenty unique prompts for each strategy, ensuring a wide range of topics and situations are covered.
3. Craft a tailored response for each prompt, applying the educational strategy in a supportive and context-specific manner.

The goal of this task is to amass a rich, varied knowledge base that empowers the chatbot to offer realistic, educational, and supportive interactions for autistic students.
Once again, to reiterate: remember to generate a minimum of **TEN unique JSON objects of prompt-response pairs for EACH educational strategy and application pair**.
   """,
  agent=data_creation_agent
)


for link in links:
    # Update task for Data Scraping Agent with the current link
    data_scraping_agent_task = Task(
        description=f"""
        Your task is to meticulously search and scrape web pages for content related to both teaching strategies and communication techniques for autistic students.
        Focus on identifying articles, studies, and resources that offer insights into effective educational practices and mentorship guidance.
        Use the html_to_text tool to convert the content of the following URL to plain text: {link}
        Extract key points that offer insights into both educational strategies and effective ways to communicate and mentor autistic students, ensuring the content's relevance and accuracy.""",
        agent=data_scraping_agent
    )


    # Define the crew with the updated task
    crew = Crew(
        agents=[data_scraping_agent, data_formatting_agent, data_creation_agent],
        tasks=[data_scraping_agent_task, data_formatting_agent_task, data_creation_agent_task],
        verbose=2,  # Verbose level set to 2 for detailed logging
        process=Process.sequential,
    )

    # Kickoff the crew for the current link and store the result
    result = crew.kickoff()
    results[link] = result

# After the loop, you have a dictionary of results for each link



[DEBUG]: Working Agent: Senior Web Research Analyst

[INFO]: Starting Task: 
        Your task is to meticulously search and scrape web pages for content related to both teaching strategies and communication techniques for autistic students.
        Focus on identifying articles, studies, and resources that offer insights into effective educational practices and mentorship guidance.
        Use the html_to_text tool to convert the content of the following URL to plain text: https://www.autism.org.uk/advice-and-guidance/topics/communication/tips
        Extract key points that offer insights into both educational strategies and effective ways to communicate and mentor autistic students, ensuring the content's relevance and accuracy.


> Entering new CrewAgentExecutor chain...
Thought: Do I need to use a tool? Yes
Action: html_to_text
Action Input: https://www.autism.org.uk/advice-and-guidance/topics/communication/tips

Fetching pages: 100%|##########| 1/1 [00:00<00:00,  1.39it/s]


[36;1m[1;3mYour Internet Explorer 11 browser is not supported by this site. Please
upgrade to a recent browser for the best experience.

Skip to main content

#

  * Community 
  * Work for us 
  * News 
  * Shop 
  * Accessibility options 

Donate

  * Advice and guidance 
    * Advice and guidance 
      * Advice and guidance 
      * Autism Services Directory 
      * Community 
      * Professional practice 
      * Spectrum Live 
      * Stories from the Spectrum 
      * The Spectrum magazine 
        * The Spectrum magazine 
        * Previous editions 
        * Spectrum Collections 
      * What is autism 
        * What is autism 
        * Autism and BAME people 
        * Autism and gender identity 
        * Autistic women and girls 
        * The causes of autism 
        * The history of autism 
        * Varying support needs 
    *   * What we do 
    * What we do 
      * What we do 
      * Adult residential services 
        * Adult residential services 
        *