# Oxylabs Search


This notebook shows how to use the Oxylabs search component.

First, create your API user credentials (`OXYLABS_USERNAME` and `OXYLABS_PASSWORD`): sign up for a free trial or purchase the product in the [Oxylabs dashboard](https://dashboard.oxylabs.io/en/registration). The examples below read these credentials from environment variables.



## Setup
The integration lives in the `langchain-community` package.

In [1]:
# %pip install -U langchain-community oxylabs

In [2]:
import getpass
import os
from pprint import pprint

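# Prompt for each credential separately so it is clear which value is expected
os.environ["OXYLABS_USERNAME"] = getpass.getpass("Oxylabs username: ")
os.environ["OXYLABS_PASSWORD"] = getpass.getpass("Oxylabs password: ")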
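
Before constructing the wrapper, it can be useful to confirm that both variables are actually set. The following is a minimal sketch; `missing_credentials` is a hypothetical helper introduced here for illustration and is not part of LangChain or the Oxylabs SDK.

```python
import os

# Environment variable names taken from the introduction of this notebook.
REQUIRED_VARS = ("OXYLABS_USERNAME", "OXYLABS_PASSWORD")


def missing_credentials(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


missing = missing_credentials()
if missing:
    print("Missing credentials:", ", ".join(missing))
else:
    print("Oxylabs credentials found.")
```

Passing a plain dict as `env` makes the helper easy to check without touching the real environment.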

## Basic usage

In [3]:
from langchain_community.utilities import OxylabsSearchAPIWrapper

search = OxylabsSearchAPIWrapper()

In [4]:
pprint(search.run("Python Programming Language"))

('  KNOWLEDGE GRAPH: \n'
 '  TITLE: Python\n'
 '  IMAGES ITEMS: \n'
 '    IMAGES-ITEM-1: Redacted base64 image string...\n'
 '    IMAGES-ITEM-2: Redacted base64 image string...\n'
 '    IMAGES-ITEM-3: Redacted base64 image string...\n'
 '    IMAGES-ITEM-4: Redacted base64 image string...\n'
 '    IMAGES-ITEM-5: Redacted base64 image string...\n'
 '  FACTOIDS ITEMS: \n'
 '    FACTOIDS-ITEM-1: \n'
 '    LINKS ITEMS: \n'
 '      LINKS-ITEM-1: \n'
 '      HREF: '
 '/search?num=5&sca_esv=808c4f249ba0e78b&gl=us&hl=en&q=Guido+van+Rossum&stick=H4sIAAAAAAAAAONgVuLUz9U3MMwwME1exCrgXpqZkq9QlpinEJRfXFyaCwDpnKYPIAAAAA&sa=X&ved=2ahUKEwiD79n0xb2JAxXHd2wGHY51MFYQmxMoAHoECDEQAg\n'
 '      TITLE: Guido van Rossum\n'
 '    TITLE: Designed by\n'
 '    CONTENT: Guido van Rossum\n'
 '    FACTOIDS-ITEM-2: \n'
 '    TITLE: Developer\n'
 '    CONTENT: Python Software Foundation\n'
 '    FACTOIDS-ITEM-3: \n'
 '    TITLE: Filename extensions\n'
 '    CONTENT: .py,.pyw,.pyz,.pyi,.pyc,.pyd\n'
 '    FACTOIDS-ITEM-4

## Number of results
You may use the parameters `start_page`, `pages`, and `limit` to specify the number of results. The default values are:
* `start_page = 1`
* `pages = 1`
* `limit = 5`
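
How these three values combine can be sketched as follows. The sketch assumes that `limit` caps the number of results per page, so `pages * limit` is only an upper bound on what comes back; check the Oxylabs documentation if the exact count matters. `build_params` and `max_results` are hypothetical helpers, not part of the wrapper.

```python
def build_params(start_page=1, pages=1, limit=5):
    """Validate paging values and return a params dict.

    The defaults mirror the default values listed above.
    """
    for name, value in (("start_page", start_page), ("pages", pages), ("limit", limit)):
        if not isinstance(value, int) or value < 1:
            raise ValueError(f"{name} must be a positive integer, got {value!r}")
    return {"start_page": start_page, "pages": pages, "limit": limit}


def max_results(params):
    """Upper bound on returned results, ASSUMING `limit` applies per page."""
    return params["pages"] * params["limit"]


params = build_params(pages=2, limit=10)
print(params, max_results(params))
```

A dict built this way can be passed as the `params` argument of `OxylabsSearchAPIWrapper`.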

In [5]:
search = OxylabsSearchAPIWrapper(
    params={
        "start_page": 1,
        "pages": 2,
        "limit": 10
    }
)


In [6]:
pprint(search.run("Python Programming Language"))

('  KNOWLEDGE GRAPH: \n'
 '  TITLE: Python\n'
 '  IMAGES ITEMS: \n'
 '    IMAGES-ITEM-1: Redacted base64 image string...\n'
 '    IMAGES-ITEM-2: Redacted base64 image string...\n'
 '    IMAGES-ITEM-3: Redacted base64 image string...\n'
 '    IMAGES-ITEM-4: Redacted base64 image string...\n'
 '    IMAGES-ITEM-5: Redacted base64 image string...\n'
 '  FACTOIDS ITEMS: \n'
 '    FACTOIDS-ITEM-1: \n'
 '    LINKS ITEMS: \n'
 '      LINKS-ITEM-1: \n'
 '      HREF: '
 '/search?sca_esv=808c4f249ba0e78b&gl=us&hl=en&q=Guido+van+Rossum&stick=H4sIAAAAAAAAAONgVuLUz9U3MMwwME1exCrgXpqZkq9QlpinEJRfXFyaCwDpnKYPIAAAAA&sa=X&ved=2ahUKEwju6Nz9xb2JAxXGqVYBHX_nOGQQmxMoAHoECD4QAg\n'
 '      TITLE: Guido van Rossum\n'
 '    TITLE: Designed by\n'
 '    CONTENT: Guido van Rossum\n'
 '    FACTOIDS-ITEM-2: \n'
 '    TITLE: Developer\n'
 '    CONTENT: Python Software Foundation\n'
 '    FACTOIDS-ITEM-3: \n'
 '    TITLE: Filename extensions\n'
 '    CONTENT: .py,.pyw,.pyz,.pyi,.pyc,.pyd\n'
 '    FACTOIDS-ITEM-4: \n'


## Result Filtering / Ordering

### Default result group order
* `knowledge_graph`  
* `combined_search_result`  
* `product_information`  
* `local_information`  
* `search_information`  
  
You can change the order in which result groups are displayed and exclude groups you do not need. For example, when only abstract knowledge is needed, the following configuration keeps `search_information` and `knowledge_graph` and omits the `product_information`, `local_information`, and `combined_search_result` groups.

In [7]:
search = OxylabsSearchAPIWrapper(
    params={
        "start_page": 1,
        "pages": 1,
        "limit": 10,
        "geo_location": "Cairo,Egypt",
        # Only these groups are returned, in this order
        "result_categories": [
            "search_information",
            "knowledge_graph",
        ],
    }
)
pprint(search.run("Visiting Great Pyramid of Giza in Egypt"), indent=4)


('  SEARCH INFORMATION: \n'
 '  QUERY: Visiting Great Pyramid of Giza in Egypt\n'
 '  GEO_LOCATION: Cairo, Egypt\n'
 '  SHOWING_RESULTS_FOR: Visiting Great Pyramid of Giza in Egypt\n'
 '  TOTAL_RESULTS_COUNT: 6870000\n'
 '\n'
 '  RELATED SEARCHES: \n'
 '  RELATED_SEARCHES ITEMS: \n'
 '    RELATED_SEARCHES-ITEM-1: Pyramids of Giza entrance fee\n'
 '    RELATED_SEARCHES-ITEM-2: Best time to visit pyramids\n'
 '    RELATED_SEARCHES-ITEM-3: Visiting the pyramids without a guide\n'
 '    RELATED_SEARCHES-ITEM-4: Giza Pyramids official website\n'
 '    RELATED_SEARCHES-ITEM-5: Where to stay in Egypt to see pyramids\n'
 '    RELATED_SEARCHES-ITEM-6: Can you visit the Pyramids from Sharm El '
 'Sheikh\n'
 '    RELATED_SEARCHES-ITEM-7: Giza Pyramids tickets office and entrance\n'
 '    RELATED_SEARCHES-ITEM-8: Pyramids ticket price for foreigners\n'
 '\n'
 '  RELATED QUESTIONS: \n'
 '  ITEMS ITEMS: \n'
 '    ITEMS-ITEM-1: \n'
 '    POS: 1\n'
 '    ANSWER: The interiors of all three pyramids of 

You can also exclude abstract knowledge when the focus is on products and local services. The following example reorders the result groups to prioritise search information and local services, and omits the knowledge graph.

In [8]:
search = OxylabsSearchAPIWrapper(
    params={
        "start_page": 1,
        "pages": 5,
        "limit": 15,
        "geo_location": "Belgium",
        "result_categories": [
            "search_information",
            "local_information",            
        ]
    }
)
pprint(search.run("Open Working Space in Belgium."))

('  SEARCH INFORMATION: \n'
 '  QUERY: Open Working Space in Belgium.\n'
 '  GEO_LOCATION: Lokeren\n'
 '  SHOWING_RESULTS_FOR: Open Working Space in Belgium.\n'
 '  TOTAL_RESULTS_COUNT: 170000000\n'
 '  NO_RESULTS_FOR_ORIGINAL_QUERY_FOUND: True\n'
 '\n'
 '  RELATED SEARCHES: \n'
 '  RELATED_SEARCHES ITEMS: \n'
 '    RELATED_SEARCHES-ITEM-1: Coworking Antwerpen\n'
 '    RELATED_SEARCHES-ITEM-2: Firma II\n'
 '    RELATED_SEARCHES-ITEM-3: Spaces Tour & Taxis\n'
 '    RELATED_SEARCHES-ITEM-4: Silversquare\n'
 '\n'
 '  RELATED QUESTIONS: \n'
 '  ITEMS ITEMS: \n'
 '    ITEMS-ITEM-1: \n'
 '    POS: 1\n'
 '    ANSWER: Roam is an immersive platform that gives remote companies their '
 'own virtual office for their colleagues, guests, customers, and professional '
 'network.\n'
 '    SOURCE: \n'
 '      SOURCE: \n'
 '      URL: '
 'https://ro.am/features/roam-meetings#:~:text=Roam%20is%20an%20immersive%20platform,%2C%20customers%2C%20and%20professional%20network.\n'
 '      TITLE: Meetings - Roa
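
The effect of `result_categories` in the two examples above can be modelled in a few lines of plain Python. This is an illustrative sketch only, not the wrapper's actual implementation: `select_groups` and the sample data are hypothetical, and the fallback to the default order for an empty list is an assumption.

```python
# Default group order as listed in this section.
DEFAULT_ORDER = [
    "knowledge_graph",
    "combined_search_result",
    "product_information",
    "local_information",
    "search_information",
]


def select_groups(groups, result_categories=None):
    """Return (name, payload) pairs in the requested order.

    An empty or missing `result_categories` falls back to DEFAULT_ORDER
    (an assumption of this sketch); names absent from `groups` are skipped.
    """
    order = result_categories or DEFAULT_ORDER
    return [(name, groups[name]) for name in order if name in groups]


# Hypothetical, heavily trimmed stand-in for parsed search results.
groups = {
    "knowledge_graph": {"title": "Python"},
    "local_information": {"items": []},
    "search_information": {"query": "Python Programming Language"},
}

for name, payload in select_groups(groups, ["search_information", "knowledge_graph"]):
    print(name, payload)
```

Listing a group both includes it and fixes its position, which is why a single parameter covers filtering and ordering.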

## Raw Results

In [9]:
search = OxylabsSearchAPIWrapper()

In [10]:
search.results("apples")

[{'paid': [],
  'organic': [{'pos': 1,
    'url': 'https://en.wikipedia.org/wiki/Apple',
    'desc': 'An apple is a round, edible fruit produced by an apple tree Apple trees are cultivated worldwide and are the most widely grown species in the genus Malus.',
    'title': 'Apple',
    'sitelinks': {'inline': [{'url': 'https://en.wikipedia.org/wiki/List_of_apple_cultivars',
       'title': 'List of apple cultivars'},
      {'url': 'https://en.wikipedia.org/wiki/Apple_Inc.',
       'title': 'Apple Inc.'},
      {'url': 'https://en.wikipedia.org/wiki/List_of_countries_by_apple_production',
       'title': 'Countries by apple production'},
      {'url': 'https://en.wikipedia.org/wiki/Apple_seed_oil',
       'title': 'Apple seed oil'}]},
    'url_shown': 'https://en.wikipedia.org› wiki › Apple',
    'pos_overall': 1,
    'favicon_text': 'Wikipedia'},
   {'pos': 2,
    'url': 'https://www.apple.com/',
    'desc': 'Discover the innovative world of Apple and shop everything iPhone, iPad, Apple 
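
Because the raw results are plain Python lists and dicts, they are straightforward to post-process. The sketch below flattens the `organic` entries into tuples; `organic_links` is a hypothetical helper, and the sample is a trimmed copy of the structure printed above, keeping only fields visible there.

```python
def organic_links(pages):
    """Flatten raw result pages into (pos, title, url) tuples."""
    links = []
    for page in pages:
        # Each page is a dict; missing keys are tolerated via .get().
        for item in page.get("organic", []):
            links.append((item.get("pos"), item.get("title"), item.get("url")))
    return links


# Trimmed sample shaped like the output above; the second title is
# left out because it is not visible in the truncated output.
sample = [
    {
        "paid": [],
        "organic": [
            {"pos": 1, "url": "https://en.wikipedia.org/wiki/Apple", "title": "Apple"},
            {"pos": 2, "url": "https://www.apple.com/"},
        ],
    }
]

for pos, title, url in organic_links(sample):
    print(pos, title, url)
```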

## Tool Usage

In [11]:
import os

from langchain_community.tools.oxylabs_search import OxylabsSearchResults, OxylabsSearchRun
from langchain_community.utilities import OxylabsSearchAPIWrapper

api_wrapper = OxylabsSearchAPIWrapper()


In [12]:
tool_results = OxylabsSearchResults(wrapper=api_wrapper)
tool_results

OxylabsSearchResults(wrapper=OxylabsSearchAPIWrapper(include_binary_image_data=False, result_categories=[], parsing_recursion_depth=5, search_engine=<oxylabs.sources.serp.google.google.Google object at 0x7d1140ebe380>, params={'source': 'google_search', 'user_agent_type': 'desktop', 'render': 'html', 'domain': 'com', 'start_page': 1, 'pages': 1, 'limit': 5, 'parse': True, 'locale': '', 'geo_location': '', 'parsing_instructions': {}, 'context': [], 'request_timeout': 165, 'result_categories': []}, excluded_result_attributes=['pos_overall'], image_binary_content_attributes=['image_data', 'data'], image_binary_content_array_attribute='images', oxylabs_username='<redacted>', oxylabs_password='<redacted>'), kwargs={})

In [13]:
tool_run = OxylabsSearchRun(wrapper=api_wrapper)
tool_run

OxylabsSearchRun(wrapper=OxylabsSearchAPIWrapper(include_binary_image_data=False, result_categories=[], parsing_recursion_depth=5, search_engine=<oxylabs.sources.serp.google.google.Google object at 0x7d1140ebe380>, params={'source': 'google_search', 'user_agent_type': 'desktop', 'render': 'html', 'domain': 'com', 'start_page': 1, 'pages': 1, 'limit': 5, 'parse': True, 'locale': '', 'geo_location': '', 'parsing_instructions': {}, 'context': [], 'request_timeout': 165, 'result_categories': []}, excluded_result_attributes=['pos_overall'], image_binary_content_attributes=['image_data', 'data'], image_binary_content_array_attribute='images', oxylabs_username='<redacted>', oxylabs_password='<redacted>'), kwargs={})

In [14]:
import json

# .invoke wraps utility.results
response_results = tool_results.invoke({
    "query": "What is the weather in Shanghai?",
    "geo_location": "China",  
})
response_results = json.loads(response_results)
for item in response_results:
    print(item)

{'paid': [], 'organic': [{'pos': 1, 'url': 'https://www.accuweather.com/en/cn/shanghai/106577/weather-forecast/106577', 'desc': 'Hourly Weather · 1 AM 61°. rain drop 7% · 2 AM 61°. rain drop 7% · 3 AM 61°. rain drop 7% · 4 AM 61°.', 'title': 'Shanghai, Shanghai, China Weather Forecast', 'sitelinks': {'inline': [{'url': 'https://www.accuweather.com/en/cn/shanghai/106577/daily-weather-forecast/106577', 'title': 'Daily'}, {'url': 'https://www.accuweather.com/en/cn/shanghai/106577/current-weather/106577', 'title': 'Current Weather'}, {'url': 'https://www.accuweather.com/en/cn/shanghai/106577/hourly-weather-forecast/106577', 'title': 'Hourly'}, {'url': 'https://www.accuweather.com/en/cn/shanghai/106577/october-weather/106577', 'title': 'Monthly'}]}, 'url_shown': 'https://www.accuweather.com › shanghai › weather-for...', 'pos_overall': 1, 'favicon_text': 'AccuWeather'}, {'pos': 2, 'url': 'https://weather.com/weather/tenday/l/7f14186934f484d567841e8646abc61b81cce4d88470d519beeb5e115c9b425a', 

In [15]:
# .invoke wraps utility.run
response_run = tool_run.invoke({
    "query": "What is the weather in Shanghai?",
    "geo_location": "China",    
})

pprint(response_run)

('  ORGANIC RESULTS ITEMS: \n'
 '    ORGANIC-ITEM-1: \n'
 '    POS: 1\n'
 '    URL: '
 'https://www.accuweather.com/en/cn/shanghai/106577/weather-forecast/106577\n'
 '    DESC: Hourly Weather · 1 AM 61°. rain drop 7% · 2 AM 61°. rain drop 7% · '
 '3 AM 61°. rain drop 7% · 4 AM 61°.\n'
 '    TITLE: Shanghai, Shanghai, China Weather Forecast\n'
 '    SITELINKS: \n'
 '      SITELINKS: \n'
 '      INLINE ITEMS: \n'
 '        INLINE-ITEM-1: \n'
 '        URL: '
 'https://www.accuweather.com/en/cn/shanghai/106577/daily-weather-forecast/106577\n'
 '        TITLE: Daily\n'
 '        INLINE-ITEM-2: \n'
 '        URL: '
 'https://www.accuweather.com/en/cn/shanghai/106577/current-weather/106577\n'
 '        TITLE: Current Weather\n'
 '        INLINE-ITEM-3: \n'
 '        URL: '
 'https://www.accuweather.com/en/cn/shanghai/106577/hourly-weather-forecast/106577\n'
 '        TITLE: Hourly\n'
 '        INLINE-ITEM-4: \n'
 '        URL: '
 'https://www.accuweather.com/en/cn/shanghai/106577/october-wea

## Chaining

In [16]:
# Install the required libraries
%pip install --upgrade --quiet langchain langchain-openai langchainhub langchain-community


Note: you may need to restart the kernel to use updated packages.


In [None]:
# Import necessary modules
import getpass
import os
from langchain import hub
from langchain.agents import AgentExecutor, create_tool_calling_agent
from langchain_openai import ChatOpenAI  # OpenAI chat model (not AzureChatOpenAI)

# Set up OpenAI API credentials
os.environ["OPENAI_API_KEY"] = getpass.getpass("Enter your OpenAI API key: ")

# Define assistant instructions and pull a base prompt template
instructions = """You are an assistant."""
base_prompt = hub.pull("langchain-ai/openai-functions-template")
prompt = base_prompt.partial(instructions=instructions)

# Initialize OpenAI chat model
llm = ChatOpenAI(
    openai_api_key=os.environ["OPENAI_API_KEY"]
)

# Define tools and agent setup
tool = OxylabsSearchRun(wrapper=api_wrapper)
tools = [tool]
agent = create_tool_calling_agent(llm, tools, prompt)

# Set up and invoke the agent executor
agent_executor = AgentExecutor(
    agent=agent,
    tools=tools,
    verbose=True,
)
agent_executor.invoke({"input": "What happened in the latest Burning Man floods?"})

In [None]:
agent_executor.invoke({"input": "Who won the latest 2024 Lithuanian elections and why? Please write a political analysis essay based on the search results."})

In [None]:
agent_executor.invoke({"input": "What is the most profitable company in Lithuania in 2024?"})