# Git+Metaphor: A tool for finding useful tools
GitHub hosts a treasure trove of useful but little-known tools that people rarely stumble upon, because most people rely on programs that are closed-source or already very popular.

This tool uses LangChain to build an LLM agent whose primary purpose is to recommend GitHub repositories based on the problem a user is trying to solve or the question they ask.

You must place three txt files in the same directory as the notebook: `OPENAIAPIKEY.txt` containing your OpenAI API key, `OPENAIORGKEY.txt` containing your OpenAI organization key, and `METAPHORAPI.txt` containing your Metaphor API key.
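
As a minimal sketch, the key files could be checked and read as shown here before running the cells below. The file names are the ones the notebook opens; `read_key` and `missing_key_files` are hypothetical helpers added for illustration and are not part of the notebook. Stripping whitespace is an assumption that guards against a trailing newline saved with the key.

```python
from pathlib import Path

# File names taken from the notebook cells; the helpers below are hypothetical.
KEY_FILES = ["OPENAIAPIKEY.txt", "OPENAIORGKEY.txt", "METAPHORAPI.txt"]

def read_key(path):
    """Return the key stored in `path`, without surrounding whitespace."""
    key = Path(path).read_text().strip()
    if not key:
        raise ValueError(f"{path} is empty")
    return key

def missing_key_files(directory="."):
    """List the expected key files that are absent from `directory`."""
    return [name for name in KEY_FILES if not (Path(directory) / name).is_file()]
```

Calling `missing_key_files()` from the notebook's directory should return an empty list once all three files are in place.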

Also make sure the required packages are installed:

* pip install pydantic
* pip install langchain
* pip install metaphor-python
* pip install openai (required by LangChain's ChatOpenAI)
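
One way to confirm the installs succeeded is to check that the modules the notebook imports directly can be found. This is only a sketch: `missing_modules` is a hypothetical helper, and the list uses import names, so `metaphor-python` appears as `metaphor_python`.

```python
import importlib.util

# Import names of the packages the notebook imports directly.
REQUIRED_MODULES = ["pydantic", "langchain", "metaphor_python"]

def missing_modules(modules=REQUIRED_MODULES):
    """Return the top-level modules from `modules` that cannot be found."""
    return [name for name in modules if importlib.util.find_spec(name) is None]

# Prints the names that still need a pip install (an empty list if none do).
print(missing_modules())
```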

In [13]:
### RUN THIS CELL
from langchain.chains import LLMChain
from langchain.tools import StructuredTool
from langchain.prompts import MessagesPlaceholder
from langchain.agents import AgentType, initialize_agent, load_tools, Tool
from langchain import OpenAI
from langchain.chat_models import ChatOpenAI
from langchain.memory import ConversationBufferMemory
from metaphor_python import Metaphor
from pydantic import BaseModel, Field


In [15]:
### RUN THIS ONE TOO
def search(text, start_published):
    # Every search is restricted to GitHub.
    domain = "https://github.com/"
    if text == "":
        return "Search Failed: Empty search query"
    # Strip whitespace so a trailing newline in the file does not end up in the key.
    with open("METAPHORAPI.txt", "r") as text_file:
        metaphorAPI = text_file.read().strip()
    metaphor = Metaphor(metaphorAPI)
    # Pass the date filter only when one was supplied.
    search_kwargs = {"include_domains": [domain]}
    if start_published != "":
        search_kwargs["start_published_date"] = start_published
    search_resp = metaphor.search(text, **search_kwargs)
    contents_resp = search_resp.get_contents()
    # Build one "title, Url" line per result for the agent to read.
    strBuilder = ""
    for content in contents_resp.contents:
        strBuilder += f"title: {content.title}, Url: {content.url}\n"
    return strBuilder
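
The `start_published` argument is expected to be a 'YYYY-MM-DD' string, and an empty string means no date filter. Since the value comes from the LLM, it may be worth validating before it is sent to the API. The following is a sketch only; `is_valid_start_date` is a hypothetical helper that the notebook does not define or call.

```python
from datetime import datetime

def is_valid_start_date(value):
    """Return True if `value` is empty (no filter) or a real date in YYYY-MM-DD form."""
    if value == "":
        return True
    try:
        parsed = datetime.strptime(value, "%Y-%m-%d")
    except ValueError:
        return False
    # Round-trip so that unpadded forms such as "2023-1-5" are rejected.
    return parsed.strftime("%Y-%m-%d") == value
```

A check like this could run at the top of `search` and return an error string, in the same way the function already does for an empty query.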


In [16]:
### RUN THIS CELL
class Smart_Tool_Agent:
    def __init__(self):
        agent_kwargs = {
        "extra_prompt_messages": [MessagesPlaceholder(variable_name="memory")],
        }
        memory = ConversationBufferMemory(memory_key="memory", return_messages=True)
        # Strip whitespace so a trailing newline in a key file does not end up in the key.
        with open("OPENAIAPIKEY.txt", "r") as text_file:
            openaiAPI = text_file.read().strip()
        with open("OPENAIORGKEY.txt", "r") as txt:
            openaiOrg = txt.read().strip()
        llm = ChatOpenAI(openai_api_key=openaiAPI, openai_organization=openaiOrg, temperature=0.2, model="gpt-3.5-turbo-0613")
        disc = "A search tool that allows you to access GitHub and get links to repositories and their titles. Useful to provide users with links to interesting or useful GitHub repositories based on a user's request."
        class searchSchema(BaseModel):
            text: str = Field(description="Should be a search query to access the web in the format of how people describe a link on the Internet. For example, 'best restaurants in SF' is a bad query, whereas 'Here is the best restaurant in SF:' is a good query.")
            start_published: str = Field(description="A date filter for only producing results published after a certain date. Dates should be a string in the format 'YYYY-MM-DD'. Use an empty string for no date filter.")

        tool = StructuredTool.from_function(name="searchTool", func=search, description=disc, args_schema=searchSchema)
        tools=[tool,]
        self.agent=initialize_agent(tools, llm, agent=AgentType.OPENAI_FUNCTIONS, agent_kwargs=agent_kwargs, memory=memory, max_iterations=10)
    def get_response(self, promptMessage):
        return self.agent.run("You are an LLM that can access GitHub repositories using the 'searchTool' tool. You always provide links to GitHub repositories in response to any question a user has in order to solve that user's question or issue. You can then also provide an explanation and answer the user, but you must always search GitHub and return links to the user. Here is the user's message:\n" + promptMessage)
    


In [17]:
## EXAMPLE OF HOW TO USE THE LLM TOOL:
new_agent=Smart_Tool_Agent()
user_input = "Find me a way to scrape the web for news"
print(new_agent.get_response(user_input))

I found some GitHub repositories related to web scraping news. Here are the links:

1. [Python3Spiders/AllNewsSpider](https://github.com/Python3Spiders/AllNewsSpider): This repository aims to scrape news from various news portals. Please note that the data obtained should not be used for commercial purposes.

2. [riad-azz/bbc-news-scraper](https://github.com/riad-azz/bbc-news-scraper): This repository allows you to scrape news articles from the BBC using Google search.

3. [gabrielkheisa/news-scrapper](https://github.com/gabrielkheisa/news-scrapper): This repository retrieves the four newest news articles from a news portal.

4. [houstonharwood/my-data-robot](https://github.com/houstonharwood/my-data-robot): This repository automatically scrapes data from websites for news reporting purposes.

5. [mugilanD98/GNscraper](https://github.com/mugilanD98/GNscraper): This repository is a news scraper.

6. [allanvini/Python-web-scraping](https://github.com/allanvini/Python-web-scraping): This 

In [None]:
### INTERACTIVE VERSION
new_agent=Smart_Tool_Agent()
while True:
    user_input = input("Enter something (or 'q' to quit): ")
    # Stop before sending the quit command to the agent.
    if user_input == "q":
        break
    print(new_agent.get_response(user_input))