# Task Definition: Spider Crawler Integration without Agents

## Overview

This task creates a simple Spider Crawler and integrates it with the workflow to scrape data from the web.

### Task Flow

1. **Input**: The user provides a URL to crawl.

2. **Spider Tool Integration**:
   - Create a spider tool that can crawl the web and extract the data from the given url.

3. **Output**: The final output is the extracted data from the given url.

This task leverages the Spider Crawler tool to extract the data from the given url.

To recreate the notebook, a link to the Colab notebook is provided below. The notebook contains the code implementation for the task.

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://drive.google.com/file/d/1JLXPn-MWxA-4rVABuZ0ZvIcdh6AQVnRf/view?usp=sharing)

For more details on the task or any questions, please feel free to reach out to the author, [Julep AI](mailto:hey@julep.ai).

### Installing the Julep Client

In [10]:
!pip install julep -U --quiet

#### NOTE:

A UUID is generated for the agent and another for the task; each one identifies its object in the system. Once the agent and task are created, keep their UUIDs unchanged for simplicity.

Changing a UUID causes the agent or task to be treated as a new one. The previous agent and task will still persist in the system.

In [11]:
# Global UUID is generated for agent and task
import uuid

AGENT_UUID = uuid.uuid4()
TASK_UUID = uuid.uuid4() 
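
Because `uuid.uuid4()` returns a new random value on every run, re-running the cell above after a kernel restart produces fresh IDs, and therefore a new agent and task. One possible way to keep the IDs stable, sketched here with only the standard library, is to derive them deterministically with `uuid.uuid5`. The namespace URL and the names `"agent"` and `"task"` are hypothetical, chosen only for this illustration.

```python
import uuid

# Hypothetical namespace for this notebook; any fixed string would work.
NOTEBOOK_NAMESPACE = uuid.uuid5(
    uuid.NAMESPACE_URL, "https://example.com/spider-crawler-notebook"
)

# uuid5 hashes (namespace, name), so the same inputs always give the same UUID.
STABLE_AGENT_UUID = uuid.uuid5(NOTEBOOK_NAMESPACE, "agent")
STABLE_TASK_UUID = uuid.uuid5(NOTEBOOK_NAMESPACE, "task")

# Deterministic: recomputing yields an identical value.
print(STABLE_AGENT_UUID == uuid.uuid5(NOTEBOOK_NAMESPACE, "agent"))  # True

# Different names give different IDs.
print(STABLE_AGENT_UUID != STABLE_TASK_UUID)  # True
```

With this approach, `STABLE_AGENT_UUID` and `STABLE_TASK_UUID` could stand in for `AGENT_UUID` and `TASK_UUID` if stable IDs across sessions are wanted.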

### Creating Julep Client with the API Key

In [None]:
from julep import Client

api_key = "" # Your API key here

# Create a client
client = Client(api_key=api_key, environment="dev")
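
Rather than pasting the key into the notebook, it can be read from the environment. The following is a minimal sketch; the helper `load_api_key` and the variable name `JULEP_API_KEY` are assumptions made for this example, not something the Julep client requires.

```python
import os


def load_api_key(env_var: str = "JULEP_API_KEY", environ=None) -> str:
    """Return the API key stored under env_var, or raise if it is missing.

    Both the helper and the default variable name are hypothetical.
    """
    environ = os.environ if environ is None else environ
    key = environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable before creating the client."
        )
    return key


# Demonstration with an in-memory mapping instead of the real environment.
demo_key = load_api_key(environ={"JULEP_API_KEY": "demo-key"})
print(demo_key)  # demo-key
```

In the notebook, the result could then be passed as `api_key` when constructing the client, for example `api_key = load_api_key()`.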

## Creating an "agent"


An agent is the object to which LLM settings, such as the model and temperature, as well as tools are scoped.

To learn more about the agent, please refer to the [documentation](https://github.com/julep-ai/julep/blob/dev/docs/julep-concepts.md#agent).

In [None]:
# Defining the agent
name = "Jarvis"
about = "The original AI conscious of the Iron Man."
default_settings = {
    "temperature": 0.7,
    "top_p": 1,
    "min_p": 0.01,
    "presence_penalty": 0,
    "frequency_penalty": 0,
    "length_penalty": 1.0,
    "max_tokens": 150,
}


# Create the agent
agent = client.agents.create_or_update(
    agent_id=AGENT_UUID,
    name=name,
    about=about,
    model="gpt-4o",
)

## Listing all the agents

We list the agents to see what agents are available.

In [None]:
# Listing all agents available in the account
get_agents = client.agents.list()
get_agents.items

We take the ID of the first agent in the list. Note that this is the agent created above only if it happens to be the first item returned.

In [None]:
# get the agent id
agent_id = get_agents.items[0].id
agent_id
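
If the account holds several agents, the first list item may not be the one created in this notebook. A hedged sketch of selecting an agent by its ID instead is shown below. The helper `find_agent` is hypothetical, and the demo uses stand-in objects with an `id` attribute rather than real API responses.

```python
import uuid
from types import SimpleNamespace


def find_agent(agents, agent_id):
    """Return the first agent whose id matches agent_id, or None.

    IDs are compared as strings so that uuid.UUID and str values both work.
    """
    wanted = str(agent_id)
    return next((agent for agent in agents if str(agent.id) == wanted), None)


# Stand-in data for the demo; real items would come from client.agents.list().
demo_id = uuid.uuid4()
demo_agents = [
    SimpleNamespace(id=str(uuid.uuid4()), name="Other"),
    SimpleNamespace(id=str(demo_id), name="Jarvis"),
]

match = find_agent(demo_agents, demo_id)
print(match.name)  # Jarvis
```

Applied to the notebook, this would look like `find_agent(get_agents.items, AGENT_UUID)`, assuming each listed item exposes an `id` attribute as the cell above suggests.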

## Defining a Task

Tasks in Julep are GitHub Actions-style workflows that define long-running, multi-step actions.
You can use them to carry out complex actions by defining them step by step. They have access to all Julep integrations.

To learn more about tasks, visit [Julep documentation](https://github.com/julep-ai/julep/blob/dev/docs/julep-concepts.md#task).

In [16]:
import yaml

Here is a task that uses the Spider integration to crawl a given URL and extract its data.

More on how to define a task can be found [here](https://github.com/julep-ai/julep/blob/dev/docs/julep-concepts.md).

In [17]:
# defining the task definition
task_def = yaml.safe_load("""
name: Agent Crawler

tools:
- name: spider_crawler
  type: integration
  integration:
    provider: spider
    setup:
      spider_api_key: "your_spider_api_key"

main:
- tool: spider_crawler
  arguments:
    url: '"https://spider.cloud"'
""")

Creating or updating the task that crawls the given URL using the Spider integration.

In [18]:
# creating the task object
task = client.tasks.create_or_update(
    task_id=TASK_UUID,
    agent_id=AGENT_UUID,
    **task_def
)
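
The task definition above hardcodes the URL. The sketch below builds an equivalent dictionary for an arbitrary URL without going through YAML. The helper `build_task_def` is hypothetical. It assumes, based on the nested quoting of `'"https://spider.cloud"'` in the YAML, that the argument value must itself be a double-quoted string literal, so it uses `json.dumps` to add the inner quotes.

```python
import json


def build_task_def(url: str, spider_api_key: str = "your_spider_api_key") -> dict:
    """Hypothetical helper: same structure as the YAML task, for any URL."""
    return {
        "name": "Agent Crawler",
        "tools": [
            {
                "name": "spider_crawler",
                "type": "integration",
                "integration": {
                    "provider": "spider",
                    "setup": {"spider_api_key": spider_api_key},
                },
            }
        ],
        "main": [
            {
                "tool": "spider_crawler",
                # json.dumps wraps the URL in double quotes, mirroring the
                # quoted form used for the url argument in the YAML.
                "arguments": {"url": json.dumps(url)},
            }
        ],
    }


demo_task_def = build_task_def("https://spider.cloud")
print(demo_task_def["main"][0]["arguments"]["url"])  # "https://spider.cloud"
```

The resulting dictionary has the same keys as `task_def`, so it could be unpacked into `client.tasks.create_or_update` in the same way.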

In [None]:
# Listing all tasks for a given agent
get_tasks = client.tasks.list(agent_id=AGENT_UUID)
get_tasks.items 

## Creating an Execution

An execution is a single run of a task. It is a way to run a task with a specific set of inputs.

The cell below creates an execution of the task defined in the YAML above.

In [21]:
# creating an execution object
execution = client.executions.create(
    task_id=TASK_UUID,
    input={}
)

In [None]:
# getting the execution details
execution = client.executions.get(execution.id)
# Printing the output
execution.output
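
An execution may still be in progress when it is first fetched, in which case its output may not be available yet. The following is a minimal polling sketch. The helper `wait_for_execution` is hypothetical, and the status names in `TERMINAL_STATUSES` are assumptions that should be checked against the values the API actually returns. The demo runs against a stand-in function instead of a real client.

```python
import time
from types import SimpleNamespace

# Assumed names of statuses after which an execution no longer changes.
TERMINAL_STATUSES = {"succeeded", "failed", "cancelled"}


def wait_for_execution(get_execution, execution_id, poll_interval=1.0, timeout=60.0):
    """Call get_execution(execution_id) until a terminal status or timeout."""
    deadline = time.monotonic() + timeout
    while True:
        result = get_execution(execution_id)
        if result.status in TERMINAL_STATUSES:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Execution {execution_id} did not finish in time")
        time.sleep(poll_interval)


# Stand-in for client.executions.get: reports three statuses in sequence.
_fake_statuses = iter(["queued", "running", "succeeded"])


def fake_get(execution_id):
    status = next(_fake_statuses)
    output = {"ok": True} if status == "succeeded" else None
    return SimpleNamespace(id=execution_id, status=status, output=output)


finished = wait_for_execution(fake_get, "demo-id", poll_interval=0.01)
print(finished.status)  # succeeded
```

With the real client, the call would be `wait_for_execution(client.executions.get, execution.id)`, assuming the returned object exposes a `status` attribute.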

Streaming execution steps of the defined Task.

In [None]:
client.executions.transitions.stream(execution_id=execution.id)

Listing all the execution states of the task.

In [None]:
client.executions.transitions.list(execution_id=execution.id).items