# LLM Example: Detect Throwaway mail domains

Find out whether a mail domain belongs to a one-time mail provider. These providers often operate under different names, so creating and updating a blacklist is tedious. This example uses an LLM agent that searches DuckDuckGo for the mail provider and states whether a domain is a throwaway address.

> Looking for someone to kickstart your Generative AI project? \
  Write me a message (jannis.gansen@codecamp-n.com) and check out https://www.codecamp-n.com/  \
  ![CodeCampN.png](../logo.png)

## 1. ⛓️ Install dependencies

- We'll use langchain for creating our agent
- openai for accessing the OpenAI models
- duckduckgo-search for querying the mail provider (as this doesn't require an account or key)


In [None]:
!pip install langchain openai duckduckgo-search

## 2. 🔑 Setup credentials

You have two options to set your credentials:



### Option 1: Set environment directly in the notebook

Insert your OpenAI API key into the OPENAI_API_KEY environment variable, but make sure not to share this information (for example by committing the notebook with the key still in it):

In [None]:
import os

os.environ["OPENAI_API_KEY"] = "** Your Key **"

### Option 2: Query via getpass

This requires the user of the notebook to enter the key.

Pro: The key is never stored in the notebook, so you can't leak it by sharing the file. \
Con: You need to enter it every time you restart the kernel.

In [None]:
from getpass import getpass
import os

os.environ["OPENAI_API_KEY"] = getpass('Your OPENAI key: ')
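The two options can be combined: use the environment variable if it is already set and only fall back to a prompt when it is missing. The following is a minimal sketch of that idea. The helper name `ensure_api_key` and its `prompt_fn` parameter are hypothetical and not part of the original notebook.

```python
import os
from getpass import getpass


def ensure_api_key(env_var="OPENAI_API_KEY", prompt_fn=getpass):
    """Return the key stored in env_var, asking for it only if it is missing.

    ensure_api_key and prompt_fn are illustrative names (assumptions),
    not part of the original notebook or of any library.
    """
    key = os.environ.get(env_var)
    if not key:
        # Nothing set (or empty): ask interactively and remember the value
        # for the rest of this kernel session.
        key = prompt_fn(f"Your {env_var}: ")
        os.environ[env_var] = key
    return key
```

Calling `ensure_api_key()` at the top of the notebook would then skip the prompt whenever the variable was already exported in the shell that started Jupyter.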

## 3. 🛠️ Setup your agent

For this example we'll use a GPT-3.5-turbo chat model and a search engine tool.
Langchain provides both.

We'll use
- the ChatOpenAI model from langchain to access OpenAI models with our agent
- the DuckDuckGoSearchRun tool from langchain to let the agent look up domains

The following block is setting up the tools. Notice how we tell our model *what* it can actually do with each tool.


In [59]:
from langchain.tools import DuckDuckGoSearchRun
from langchain.agents import Tool
from langchain.chat_models import ChatOpenAI


llm = ChatOpenAI(temperature=0, model="gpt-3.5-turbo-0613")
search = DuckDuckGoSearchRun()

tools = [
    Tool(
        name = "Search",
        func=search.run,
        description="""
Useful for when you need to answer questions about a domain. You can search for the domain name and 'no account required'
"""
    )
]


Next we create an agent that will perform a task for us.
Providing the tools allows the agent to search for entries.

We'll use OPENAI_FUNCTIONS as the agent type, as the model is trained for working with tools. Check out the [Agent Types](https://python.langchain.com/docs/modules/agents/agent_types/) in the langchain documentation and feel free to try others.

In [60]:
from langchain.agents import initialize_agent
from langchain.agents import AgentType

# Use the OPENAI_FUNCTIONS agent type described above
agent = initialize_agent(tools, llm, agent=AgentType.OPENAI_FUNCTIONS, verbose=True)

## 4. 🕵️‍♀️ Run your agent

Executing your agent is simple now. We first create a prompt template (beware that this does not prevent prompt injection!)

In [75]:
from langchain import PromptTemplate


template = """\
You are responsible for finding out whether a user's domain is allowed to \
  access your service.

Only domains that are not from one-time, temporary, disposable or throwaway mail providers are allowed.
Is the domain from the mail {mail} allowed?

Explicitly search whether you can send mails without an account from this service.
If it is not explicitly promoted as disposable, anonymous, etc. the service should be allowed.

Return the answer as a json with the boolean field "allowed".
Only return JSON.
"""

prompt = PromptTemplate.from_template(template)

...and finally we can execute the agent using the prompt

This should not be allowed:

In [None]:
agent.run(prompt.format(mail="throwawaymail.com"))

...while this should be a reputable source.

In [None]:
agent.run(prompt.format(mail="gmail.com"))
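Because the template inserts user input directly into the prompt, one cheap mitigation is to validate the input before calling `prompt.format`. The sketch below accepts either a full address or a bare domain and rejects anything that does not look like a plain domain name. The helper `extract_domain` and the pattern are assumptions made for illustration. The pattern is deliberately conservative (it rejects internationalized top-level domains, for example) and it reduces, but does not eliminate, the prompt injection risk.

```python
import re

# Conservative pattern (an assumption for this sketch): dot-separated labels of
# letters, digits and hyphens, followed by an alphabetic top-level domain.
_DOMAIN_RE = re.compile(
    r"^(?=.{1,253}$)(?:[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?\.)+[a-z]{2,63}$"
)


def extract_domain(mail: str) -> str:
    """Return the lowercased domain of an address or bare domain.

    Raises ValueError if the result contains anything other than a plain
    domain name (spaces, newlines, quotes, extra instructions, ...).
    """
    # Everything after the last "@"; a bare domain is returned unchanged.
    domain = mail.strip().rsplit("@", 1)[-1].lower()
    if not _DOMAIN_RE.match(domain):
        raise ValueError(f"not a plausible mail domain: {mail!r}")
    return domain
```

With this in place, the agent call could become `agent.run(prompt.format(mail=extract_domain(user_input)))`, where `user_input` stands for whatever the user submitted.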

## 🚀 Where do we go from here?

### Improve the prompt

You may notice that the response isn't always JSON.
Can you figure out how to fix it?
Ideas:
- Tune the prompt
- Create a second call to the LLM that converts the response to JSON.

Also check out the [Output parsers](https://python.langchain.com/docs/modules/model_io/output_parsers/) from langchain!
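A third, library-independent option is to post-process the agent's answer yourself. The following sketch uses only the Python standard library to pull the first JSON object with a boolean `allowed` field out of a response that may contain extra text around it. The function name `parse_allowed` is hypothetical, and the approach assumes the JSON object is flat (no nested braces), which matches the format requested in the prompt.

```python
import json
import re
from typing import Optional


def parse_allowed(response: str) -> Optional[bool]:
    """Return the boolean "allowed" field found in response, or None.

    parse_allowed is an illustrative helper, not part of langchain.
    """
    # Look at every flat {...} span; the model may wrap the JSON in prose.
    for candidate in re.findall(r"\{[^{}]*\}", response):
        try:
            data = json.loads(candidate)
        except json.JSONDecodeError:
            continue  # Not valid JSON, try the next candidate
        allowed = data.get("allowed")
        if isinstance(allowed, bool):
            return allowed
    return None  # No usable answer; the caller decides how to handle this
```

Returning `None` instead of guessing lets the caller decide whether to retry the agent or fall back to a manual review.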

### Contact us!

Check out codecamp-n.com and feel free to drop me a message via
 
- ✉️ Mail: jannis.gansen@codecamp-n.com
- or LinkedIn https://www.linkedin.com/in/jannis-gansen/