# E2B Data Analysis

[E2B's cloud environments](https://e2b.dev) are great runtime sandboxes for LLMs.

E2B's Data Analysis sandbox runs code safely in an isolated environment. This makes it well suited to building tools such as code interpreters, or features like Advanced Data Analysis in ChatGPT.

The E2B Data Analysis sandbox allows you to:
- Run Python code
- Generate charts via matplotlib
- Install Python packages dynamically during runtime
- Install system packages dynamically during runtime
- Run shell commands
- Upload and download files

We'll create a simple OpenAI agent that uses E2B's Data Analysis sandbox to analyze an uploaded file using Python.

Get your OpenAI API key and [E2B API key here](https://e2b.dev/docs/getting-started/api-key) and set them as environment variables.


In [None]:
import os
os.environ["E2B_API_KEY"] = "<E2B_API_KEY>"
os.environ["OPENAI_API_KEY"] = "<OPENAI_API_KEY>"
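
The placeholder strings above must be replaced with real keys, so it can help to fail early when one is missing. The following is a minimal sketch and not part of E2B or LangChain: the `missing_keys` helper, and its rule that a value wrapped in angle brackets counts as a placeholder, are assumptions made for illustration.

```python
import os

# Environment variables this notebook relies on.
REQUIRED_KEYS = ("E2B_API_KEY", "OPENAI_API_KEY")


def missing_keys(env=None, required=REQUIRED_KEYS):
    """Return the required variable names that are unset or still placeholders.

    Hypothetical helper: treating values wrapped in angle brackets as
    placeholders is an assumption based on the cell above.
    """
    env = os.environ if env is None else env
    missing = []
    for name in required:
        value = env.get(name, "")
        if not value or (value.startswith("<") and value.endswith(">")):
            missing.append(name)
    return missing


# Report the keys that still need a real value.
print(missing_keys())
```

An empty list means both keys are set to something other than a placeholder; it does not verify that the keys are valid.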

In [None]:
from langchain.chat_models import ChatOpenAI
from langchain.tools import E2BDataAnalysisTool
from langchain.agents import initialize_agent, AgentType

e2b_data_analysis_tool = E2BDataAnalysisTool()

Upload an example CSV data file to the sandbox so we can analyze it with our agent. For example, you can use [this file](https://storage.googleapis.com/e2b-examples/netflix.csv) about Netflix TV shows.

In [None]:
with open("./netflix.csv") as f:
  e2b_data_analysis_tool.upload_file(
    file=f,
    description="Data about Netflix tv shows including their title, category, director, release date, casting, age rating, etc.",
  )

Create a `Tool` object and initialize the LangChain agent.

In [None]:
tools = [e2b_data_analysis_tool.as_tool()]

llm = ChatOpenAI(model="gpt-4", temperature=0)
agent = initialize_agent(
    tools, llm, agent=AgentType.OPENAI_FUNCTIONS, verbose=True, handle_parsing_errors=True
)

Now we can ask the agent questions about the CSV file we uploaded earlier.

In [None]:
agent.run("What are the 5 longest movies on netflix released between 2000 and 2010?")

E2B also allows you to install both Python and system (via `apt`) packages dynamically during runtime like this:

In [None]:
# Install Python package
e2b_data_analysis_tool.install_python_packages('pandas')

# Install system package
e2b_data_analysis_tool.install_system_packages('ffmpeg')

Additionally, you can download any file from the sandbox like this:

In [None]:
# The path is a remote path in the sandbox
files_in_bytes = e2b_data_analysis_tool.download_file('/home/user/file')
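
As the variable name `files_in_bytes` suggests, the download is assumed here to be a `bytes` object, so it still has to be written to disk to become a local file. A minimal sketch follows; the `save_bytes` helper is hypothetical, and a made-up byte string stands in for the real download so the example runs without a sandbox.

```python
import tempfile
from pathlib import Path


def save_bytes(data, destination):
    """Write `data` to `destination`, creating parent directories as needed."""
    path = Path(destination)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(data)
    return path


# Stand-in for `files_in_bytes`, so the sketch runs without a sandbox.
example_bytes = b"title,duration\nExample,90\n"

# Write into a temporary directory to avoid touching the working directory.
saved_path = save_bytes(example_bytes, Path(tempfile.mkdtemp()) / "out" / "file.csv")
print(saved_path)
```

With a live sandbox, `save_bytes(files_in_bytes, "./file")` would be the equivalent call.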

Lastly, you can run any shell command inside the sandbox:

In [None]:
# Install SQLite
output = e2b_data_analysis_tool.run_command("sudo apt update && sudo apt install -y sqlite3")
print(output["stdout"])
print(output["stderr"])
print(output["exit_code"])

# Check the SQLite version
output = e2b_data_analysis_tool.run_command("sqlite3 --version")
print(output["stdout"])
print(output["stderr"])
print(output["exit_code"])
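
Printing the three fields is useful for inspection, but a script usually needs to stop when a command fails. The sketch below assumes only what the cell above shows, namely that the result can be indexed with the keys `stdout`, `stderr`, and `exit_code`. The `check_command` helper is hypothetical, and the two result dictionaries are made-up samples rather than real sandbox output.

```python
def check_command(output):
    """Return stdout if the command succeeded, otherwise raise RuntimeError."""
    if output["exit_code"] != 0:
        raise RuntimeError(
            f"command failed with exit code {output['exit_code']}: "
            f"{output['stderr'].strip()}"
        )
    return output["stdout"]


# Made-up results with the same three keys used above.
succeeded = {"stdout": "done\n", "stderr": "", "exit_code": 0}
failed = {"stdout": "", "stderr": "not found\n", "exit_code": 127}

print(check_command(succeeded))
try:
    check_command(failed)
except RuntimeError as error:
    print(error)
```

With a live sandbox, the helper would wrap a call such as `check_command(e2b_data_analysis_tool.run_command("sqlite3 --version"))`.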

When your agent is finished, don't forget to close the sandbox:

In [None]:
e2b_data_analysis_tool.close()
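
If the agent raises an exception before the cell above runs, the sandbox stays open. One way to guard against that is `contextlib.closing`, which calls `close()` on any object when the `with` block exits, including on error. The sketch below uses a hypothetical `FakeSandboxTool` stand-in so it runs without an E2B key; the only assumption about the real tool is the `close()` method shown above.

```python
from contextlib import closing


class FakeSandboxTool:
    """Hypothetical stand-in exposing the same close() method as the tool."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


tool = FakeSandboxTool()
try:
    with closing(tool):
        # Simulate the agent failing partway through its work.
        raise ValueError("simulated agent failure")
except ValueError:
    pass

print(tool.closed)  # True: close() ran even though the block raised
```

With the real tool, the agent calls would go inside `with closing(e2b_data_analysis_tool):` in place of the simulated failure.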