# Clarity

Welcome to Clarity, the recipe for a perplexity.ai clone!

If you're looking for a way to improve your understanding and gain clarity on any topic, this repo has got you covered.

## What is Perplexity?

With perplexity.ai, users enter a search query and receive a summary of the top search results, including sources. This recipe shows how to achieve similar results in five steps.

## The steps for reproduction

This repo shows how you can get similar results to perplexity.ai using the following steps:

1. Create search query from the input of the user
   - The user's question is sent to ChatGPT, which turns it into a short search query.
2. Extract urls from the search query
   - The query is submitted to a search engine, in this case DuckDuckGo, which returns urls.
3. Extract the content from x amount of urls
   - Take the first few urls and read the `innerText` of each page, cut off at around 2-3k tokens so that the prompt and the ChatGPT response together fit within the model's limit of about 4k tokens.
4. Make a summary of the content
   - The content from the previous step is sent to ChatGPT together with the question, and the response is a summary that answers it. Because this loops over x urls, you get multiple summaries, each with a known source.
5. _OPTIONAL_ Make a summary of all the summaries for the response
   - It can be helpful to have a summary of the summaries, but it is not recommended to get sources from this, as they might be scrambled or contaminated.
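
The cutoff in step 3 can be sketched as a small helper. This is only an illustration: `truncateToTokenBudget` is a hypothetical name, and the ratio of about 4 characters per token is a rough assumption for English text, not a real tokenizer.

```javascript
// Assumption: roughly 4 characters per token. A real tokenizer would be more accurate.
const CHARS_PER_TOKEN = 4;

// Returns `text` cut down to approximately `maxTokens` tokens.
function truncateToTokenBudget(text, maxTokens, charsPerToken = CHARS_PER_TOKEN) {
    const maxChars = Math.max(0, Math.floor(maxTokens * charsPerToken));
    if (text.length <= maxChars) return text;
    // Cut at the last space at or before the limit so a word is not split in half.
    const cut = text.lastIndexOf(' ', maxChars);
    return text.slice(0, cut > 0 ? cut : maxChars);
}
```

With a budget of 2,500 tokens this keeps about 10,000 characters of page text, which leaves the rest of the window for the question and the response.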

In [None]:
import puppeteer from 'puppeteer';
import { Configuration, OpenAIApi } from 'openai';
import * as dotenv from 'dotenv';

// Load API_KEY from a local .env file into process.env.
dotenv.config();


In [None]:
const configuration = new Configuration({
    apiKey: process.env.API_KEY,
});
const openai = new OpenAIApi(configuration);

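// Sends a chat to the model and returns the trimmed text of the first reply.
// Callers pass the newest prompt first, followed by the history. Reversing the
// chain puts the system message first and the prompt last. Note that this also
// reverses the order of the history messages.
export async function chat(chatChain) {
    // Build a new array instead of mutating the one passed in by the caller.
    const messages = [...chatChain, { role: 'system', content: 'You are a good bot.' }].reverse();
    const completion = await openai.createChatCompletion({
        model: "gpt-3.5-turbo",
        messages: messages,
        temperature: 0.3,
    });

    return completion.data.choices[0].message.content.trim();
}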
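
One way to make the message order explicit, rather than relying on a reverse, is to assemble the list from its three parts. This is a minimal sketch: `buildMessages` is a hypothetical helper that is not part of the repo, and it assumes the history is stored oldest first.

```javascript
// Returns a new message list: system message, then history (oldest first), then the prompt.
// Neither input array is modified.
function buildMessages(systemPrompt, history, prompt) {
    return [
        { role: 'system', content: systemPrompt },
        ...history,
        ...prompt,
    ];
}
```

The result could be passed as `messages` to `createChatCompletion`, which keeps the history in chronological order and the newest prompt last.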

In [None]:

// Loads a page in a headless browser and returns the visible text of its <body>.
export async function getContentFromHeadlessBrowser(url) {
    const browser = await puppeteer.launch({ headless: true });
    try {
        const page = await browser.newPage();
        await page.goto(url);
        return await page.evaluate(() => document.body.innerText);
    } finally {
        // Close the browser even if navigation or evaluation throws.
        await browser.close();
    }
}

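// Loads a DuckDuckGo HTML results page and returns the displayed result urls.
export async function getUrlsFromDuckDuckGo(url) {
    // Escape characters that are not valid in a url, such as spaces in the query.
    url = encodeURI(url);

    console.log(url);
    const browser = await puppeteer.launch({ headless: true });
    try {
        const page = await browser.newPage();
        await page.goto(url);
        // Wait until at least one result has rendered before reading them all.
        await page.waitForSelector('.result__url');
        return await page.$$eval('.result__url', as => as.map(a => a.innerText));
    } finally {
        // Close the browser even if the page fails to load or has no results.
        await browser.close();
    }
}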
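
`encodeURI` leaves characters such as `&` and `#` untouched, so a query containing them would change the meaning of the search url. A hedged alternative is to build the url with the standard `URL` API, which escapes the query value on its own. Both helper names below are hypothetical, and `toAbsoluteUrl` assumes the scraped result text may arrive with or without a scheme.

```javascript
// Builds the DuckDuckGo HTML search url for a query, escaping the query value.
function buildSearchUrl(query) {
    const url = new URL('https://html.duckduckgo.com/html/');
    url.searchParams.set('q', query);
    return url.toString();
}

// Trims a scraped result and prepends https:// when no scheme is present.
function toAbsoluteUrl(resultText) {
    const trimmed = resultText.trim();
    return /^https?:\/\//i.test(trimmed) ? trimmed : `https://${trimmed}`;
}
```

A url built this way is already escaped, so it should not be passed through `encodeURI` a second time, because that would escape the `%` signs again.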

In [None]:
let history = []; // conversation history shared by every ask() call; it gives follow-up questions their context

async function ask(input) {
    // step 1 Create search query from the input of the user
    let searchTerm = await chat([{ content: `You are a search query maker. Give back a search query for this: ${input}`, role: 'user' }, { content: `Only give back a short search query for all my questions`, role: 'user' }].concat(history));
    searchTerm = searchTerm.replaceAll('"', '');

    // step 2 Extract urls from the search query
    let urls = await getUrlsFromDuckDuckGo(`https://html.duckduckgo.com/html/?q=${searchTerm}`);

    // Strings are immutable, so the trimmed values have to be stored back.
    urls = urls.map(url => url.trim());

    // step 3 Extract the content from x amount of urls
    let answers = [];

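    // Use at most the first three results; the search may return fewer.
    for (let i = 0; i < Math.min(3, urls.length); i++) {
        const url = urls[i];
        // Results are assumed to come back without a scheme, so prepend one.
        let webContent = await getContentFromHeadlessBrowser(`https://${url}`);
        // step 4 Make a summary of the content
        // Only the first 2048 characters of the page are sent to the model.
        answers.push(await chat([{ content: `Answer this question: ${input}\r\nWith a very detailed summary:\r\n${webContent.substring(0, 2048)}`, role: 'user' }].concat(history)) + ` source: *[${url}]*`);
    }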

    // optional: step 5 Make a summary of all the summaries for the response
    // let response = await chat([{ content: `Answer this question: ${input}\r\nUse the following information for the original question: ${answers}`, role: 'user' }].concat(history));

    return answers.join(' ');
}
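
For the optional step 5, interpolating the `answers` array directly into a template string joins the summaries with commas, which blurs where one ends and the next begins. A minimal sketch of a prompt builder that numbers them instead is shown here; `buildFinalPrompt` is a hypothetical helper and the exact prompt wording is an assumption.

```javascript
// Builds the step 5 prompt from the question and the per-page summaries.
function buildFinalPrompt(input, answers) {
    // Number each summary and separate them with blank lines so they stay distinct.
    const numbered = answers
        .map((answer, i) => `[${i + 1}] ${answer}`)
        .join('\n\n');
    return `Answer this question: ${input}\n` +
        `Use the following numbered summaries for the original question:\n\n${numbered}`;
}
```

The returned string could be used as the `content` of a single user message passed to `chat`, in the same way as the per-page prompts.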

In [None]:
let input = 'What is the latest news on coronavirus?';
let response = await ask(input);
console.log(`Your answer on input:\r\n${response}`);
history.push({ content: input, role: 'user' });
history.push({ content: response, role: 'assistant' });

let input2 = 'How many deaths?'; // it knows context from history
let response2 = await ask(input2);
console.log(`Your answer on input2:\r\n${response2}`);
history.push({ content: input2, role: 'user' });
history.push({ content: response2, role: 'assistant' });
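
Each turn pushes the question and the full answer onto `history`, so the prompt grows with every call and will eventually exceed the model's token limit. One possible way to bound it is to keep only the most recent turns. `trimHistory` is a hypothetical helper, and it assumes each turn is stored as one user message followed by one assistant message.

```javascript
// Returns a copy of `history` holding only the last `maxTurns` user/assistant pairs.
function trimHistory(history, maxTurns) {
    // slice(-0) would return the whole array, so handle zero explicitly.
    if (maxTurns <= 0) return [];
    return history.slice(-maxTurns * 2);
}
```

For example, `history = trimHistory(history, 2);` after each turn would keep the last two questions and their answers.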