[FEATURE] Adds SearchApi as WebSearchEngine and Tool #1216

Status: Open. Wants to merge 22 commits into base: main.

Commits (22):
1080630 added searchapi engine (zambrinf, Jun 2, 2024)
fe50b83 Resolves #1132 - added searchapi web search engine (zambrinf, Jun 2, 2024)
79e6e81 Merge branch 'langchain4j:main' into searchapi-websearchengine (zambrinf, Jun 4, 2024)
7a5656f Merge branch 'main' into searchapi-websearchengine (zambrinf, Jun 4, 2024)
6f1c1f2 Merge branch 'main' into searchapi-websearchengine (zambrinf, Jun 7, 2024)
6ebfe8b adjusting test (zambrinf, Jun 30, 2024)
9a8aba9 multiple strings to constant, minor changes to texts, removed chat me… (zambrinf, Jul 6, 2024)
e364031 Merge branch 'main' of https://github.com/langchain4j/langchain4j int… (zambrinf, Jul 6, 2024)
e7004d6 Merge branch 'main' into searchapi-websearchengine (zambrinf, Jul 7, 2024)
fe4c080 change to parent version (zambrinf, Jul 7, 2024)
1a6fdce Merge branch 'searchapi-websearchengine' of https://github.com/zambri… (zambrinf, Jul 7, 2024)
3b6e010 Merge branch 'main' of https://github.com/langchain4j/langchain4j int… (zambrinf, Jul 15, 2024)
8b3408e simplified the logic to only use the engines that return organic resu… (zambrinf, Jul 15, 2024)
7c8e271 fix total results (zambrinf, Jul 15, 2024)
981e1e2 small refactoring (zambrinf, Jul 16, 2024)
c97c0fd making everything to be put in the results metadata (zambrinf, Jul 16, 2024)
f949d6c adjusting metadata (zambrinf, Jul 16, 2024)
caef543 ordering (zambrinf, Jul 16, 2024)
647b992 renaming (zambrinf, Jul 16, 2024)
6807aa1 removing search-information (zambrinf, Jul 16, 2024)
6d168a6 search_information just in the metadata (zambrinf, Jul 16, 2024)
7496cd8 Merge branch 'main' into searchapi-websearchengine (zambrinf, Jul 20, 2024)
8 changes: 8 additions & 0 deletions docs/docs/integrations/web-search/_category_.json
@@ -0,0 +1,8 @@
{
  "label": "Web Search",
  "position": 19,
  "link": {
    "type": "generated-index",
    "description": "Web Search"
  }
}
109 changes: 109 additions & 0 deletions docs/docs/integrations/web-search/searchapi.md
@@ -0,0 +1,109 @@
---
sidebar_position: 1
---

# SearchApi

[SearchApi](https://www.searchapi.io/) is a real-time SERP API. You can use it to perform searches in Google, Google News, Bing, Bing News, Baidu, Google Scholar, or any other engine that returns organic results.

## Usage

### Dependencies setup

Add the following dependency to your project's `pom.xml`:
```xml
<dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j-web-search-engine-searchapi</artifactId>
    <version>{your-version}</version> <!-- Specify langchain4j version here -->
</dependency>
```

or to your project's `build.gradle`:

```groovy
implementation 'dev.langchain4j:langchain4j-web-search-engine-searchapi:{your-version}'
```

### Example code:

```java
import dev.langchain4j.model.chat.ChatLanguageModel;
import dev.langchain4j.model.openai.OpenAiChatModel;
import dev.langchain4j.model.openai.OpenAiChatModelName;
import dev.langchain4j.service.AiServices;
import dev.langchain4j.web.search.WebSearchTool;
import dev.langchain4j.web.search.searchapi.SearchApiWebSearchEngine;

import java.util.HashMap;
import java.util.Map;

public class SearchApiTool {

    interface Assistant {
        @dev.langchain4j.service.SystemMessage({
                "You are a web search support agent.",
                "If there is any event that has not happened yet",
                "You MUST create a web search request with user query and",
                "use the web search tool to search the web for organic web results.",
                "Include the source link in your final response."
        })
        String answer(String userMessage);
    }

    private static final String SEARCHAPI_API_KEY = "YOUR_SEARCHAPI_KEY";
    private static final String OPENAI_API_KEY = "YOUR_OPENAI_KEY";

    public static void main(String[] args) {
        Map<String, Object> optionalParameters = new HashMap<>();
        optionalParameters.put("gl", "us");
        optionalParameters.put("hl", "en");
        optionalParameters.put("google_domain", "google.com");

        SearchApiWebSearchEngine searchEngine = SearchApiWebSearchEngine.builder()
                .apiKey(SEARCHAPI_API_KEY)
                .engine("google")
                .optionalParameters(optionalParameters)
                .build();
        ChatLanguageModel chatModel = OpenAiChatModel.builder()
                .apiKey(OPENAI_API_KEY)
                .modelName(OpenAiChatModelName.GPT_3_5_TURBO)
                .logRequests(true)
                .build();

        WebSearchTool webTool = WebSearchTool.from(searchEngine);

        Assistant assistant = AiServices.builder(Assistant.class)
                .chatLanguageModel(chatModel)
                .tools(webTool)
                .build();

        String answer = assistant.answer("My family is coming to visit me in Madrid next week, list the best tourist activities suitable for the whole family");
        System.out.println(answer);
        /*
            Here are some of the best tourist activities suitable for the whole family in Madrid:

            1. **Parque del Retiro** - A beautiful public park where families can enjoy nature and various activities.
            2. **Prado Museum** - A renowned art museum that can be fascinating for both adults and children.
            3. **Mercado de San Miguel** - A market where you can explore and taste delicious Spanish food.
            4. **Royal Palace** - Explore the grandeur of the Royal Palace of Madrid.
            5. **Plaza Mayor** and **Puerta del Sol** - Historic squares with a vibrant atmosphere.
            6. **Santiago Bernabeu Stadium** - Perfect for sports enthusiasts and soccer fans.
            7. **Gran Via** - A famous street for shopping, entertainment, and sightseeing.
            8. **National Archaeological Museum** - Discover Spain's rich history through archaeological artifacts.
            9. **Templo de Debod** - An ancient Egyptian temple in the heart of Madrid.
        */
    }
}
```

### Available engines in LangChain4j

| SearchApi Engine | Available |
|-----------------------------------------------------------|-----------|
| [Google Web Search](https://www.searchapi.io/docs/google) | ✅ |
| [Google News](https://www.searchapi.io/docs/google-news) | ✅ |
| [Bing](https://www.searchapi.io/docs/bing) | ✅ |
| [Bing News](https://www.searchapi.io/docs/bing-news) | ✅ |
| [Baidu](https://www.searchapi.io/docs/baidu) | ✅ |

Any other engine that returns an `organic_results` array whose entries contain `title`, `link`, and `snippet` is also supported by this library, even if it is not listed above.
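
The compatibility rule above can be sketched as a small check over an already-parsed response. This is a minimal illustration, not the library's implementation: the class name `OrganicResultsCheck` and the method `isSupported` are hypothetical, the response is assumed to be parsed into plain `Map` and `List` objects, and an empty `organic_results` array is assumed to count as compatible.

```java
import java.util.List;
import java.util.Map;

// Hypothetical helper, for illustration only; not part of the library.
final class OrganicResultsCheck {

    // Field names taken from the documentation sentence above.
    private static final List<String> REQUIRED_FIELDS = List.of("title", "link", "snippet");

    /**
     * Returns true if the response has an "organic_results" list and every
     * entry in it is a map with string values for title, link and snippet.
     * An empty list passes (assumption: "no results" is still a valid shape).
     */
    static boolean isSupported(Map<String, Object> response) {
        Object organic = response.get("organic_results");
        if (!(organic instanceof List<?>)) {
            return false;
        }
        for (Object entry : (List<?>) organic) {
            if (!(entry instanceof Map<?, ?>)) {
                return false;
            }
            Map<?, ?> result = (Map<?, ?>) entry;
            for (String field : REQUIRED_FIELDS) {
                if (!(result.get(field) instanceof String)) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Map<String, Object> response = Map.of(
                "organic_results", List.of(
                        Map.of("title", "Example",
                               "link", "https://example.com",
                               "snippet", "An example result")));
        System.out.println(isSupported(response));
    }
}
```

A check like this could be useful when trying an engine that is not in the table: run one query, parse the JSON, and confirm the shape before wiring the engine into an assistant.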
10 changes: 10 additions & 0 deletions langchain4j-bom/pom.xml
@@ -417,6 +417,16 @@
<artifactId>langchain4j-web-search-engine-google-custom</artifactId>
<version>${project.version}</version>
</dependency>
<dependency>
<groupId>dev.langchain4j</groupId>
<artifactId>langchain4j-web-search-engine-tavily</artifactId>
<version>${project.version}</version>
</dependency>
<dependency>
<groupId>dev.langchain4j</groupId>
<artifactId>langchain4j-web-search-engine-searchapi</artifactId>
<version>${project.version}</version>
</dependency>

<!-- experimental -->
<dependency>
1 change: 1 addition & 0 deletions pom.xml
@@ -80,6 +80,7 @@
<!-- web search engines -->
<module>web-search-engines/langchain4j-web-search-engine-google-custom</module>
<module>web-search-engines/langchain4j-web-search-engine-tavily</module>
<module>web-search-engines/langchain4j-web-search-engine-searchapi</module>

<!-- embedding store filter parsers -->
<module>embedding-store-filter-parsers/langchain4j-embedding-store-filter-parser-sql</module>
90 changes: 90 additions & 0 deletions web-search-engines/langchain4j-web-search-engine-searchapi/pom.xml
@@ -0,0 +1,90 @@
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>

<parent>
<groupId>dev.langchain4j</groupId>
<artifactId>langchain4j-parent</artifactId>
<version>0.32.0</version>
<relativePath>../../langchain4j-parent/pom.xml</relativePath>
</parent>

<artifactId>langchain4j-web-search-engine-searchapi</artifactId>
<packaging>jar</packaging>

<name>LangChain4j :: Web Search Engine :: SearchApi</name>

<dependencies>

<dependency>
<groupId>dev.langchain4j</groupId>
<artifactId>langchain4j-core</artifactId>
</dependency>

<dependency>
<groupId>com.squareup.retrofit2</groupId>
<artifactId>retrofit</artifactId>
</dependency>

<dependency>
<groupId>com.squareup.retrofit2</groupId>
<artifactId>converter-gson</artifactId>
</dependency>
Review comment (Contributor) on lines +31 to +34:

A few months ago we started moving the JSON library used in LLM integrations to Jackson, for compatibility with Java frameworks (Spring, Quarkus, etc.). A web search integration is normally used together with an LLM integration. It's not critical, but since this is a new integration, it would be great if you used the Jackson mapper.

<dependency>
    <groupId>com.squareup.retrofit2</groupId>
    <artifactId>converter-jackson</artifactId>
</dependency>
<dependency>
    <groupId>com.fasterxml.jackson.core</groupId>
    <artifactId>jackson-databind</artifactId>
</dependency>

See the Anthropic or Mistral AI integration for an example.


<dependency>
<groupId>com.squareup.okhttp3</groupId>
<artifactId>okhttp</artifactId>
</dependency>

<dependency>
<groupId>org.projectlombok</groupId>
<artifactId>lombok</artifactId>
<scope>provided</scope>
</dependency>

<dependency>
<groupId>org.junit.jupiter</groupId>
<artifactId>junit-jupiter</artifactId>
<scope>test</scope>
</dependency>

<dependency>
<groupId>org.assertj</groupId>
<artifactId>assertj-core</artifactId>
<scope>test</scope>
</dependency>

<dependency>
<groupId>dev.langchain4j</groupId>
<artifactId>langchain4j</artifactId>
<scope>test</scope>
</dependency>

<!-- Visibility for WebSearchEngineIT -->
<dependency>
<groupId>dev.langchain4j</groupId>
<artifactId>langchain4j-core</artifactId>
<classifier>tests</classifier>
<type>test-jar</type>
<scope>test</scope>
</dependency>

<dependency>
<groupId>dev.langchain4j</groupId>
<artifactId>langchain4j-open-ai</artifactId>
<scope>test</scope>
</dependency>
</dependencies>

<licenses>
<license>
<name>Apache-2.0</name>
<url>https://www.apache.org/licenses/LICENSE-2.0.txt</url>
<distribution>repo</distribution>
<comments>A business-friendly OSS license</comments>
</license>
</licenses>

</project>
@@ -0,0 +1,76 @@
package dev.langchain4j.web.search.searchapi;

import com.google.gson.Gson;
import com.google.gson.GsonBuilder;
import lombok.Builder;
import okhttp3.OkHttpClient;
import okhttp3.ResponseBody;
import retrofit2.Response;
import retrofit2.Retrofit;
import retrofit2.converter.gson.GsonConverterFactory;

import java.io.IOException;
import java.time.Duration;
import java.util.HashMap;
import java.util.Map;

import static com.google.gson.FieldNamingPolicy.LOWER_CASE_WITH_UNDERSCORES;

class SearchApiClient {

private static final Gson GSON = new GsonBuilder()

Review comment (Contributor): Same here, use a Jackson mapper instead of Gson.

.setFieldNamingPolicy(LOWER_CASE_WITH_UNDERSCORES)
.setPrettyPrinting()
.create();

private final SearchApiWebSearchApi api;

@Builder
SearchApiClient(String baseUrl, Duration timeout) {
OkHttpClient.Builder okHttpClientBuilder = new OkHttpClient.Builder()
.callTimeout(timeout)
.connectTimeout(timeout)
.readTimeout(timeout)
.writeTimeout(timeout);
Retrofit retrofit = new Retrofit.Builder()
.baseUrl(baseUrl)

Review comment (Contributor) on lines +35 to +36:

If `baseUrl` is mandatory for the API, the builder should validate it. Since you're using Lombok, you could add the check in your main constructor.

Suggested change:
- Retrofit retrofit = new Retrofit.Builder()
- .baseUrl(baseUrl)
+ if (isNullOrBlank(baseUrl)) {
+     throw new IllegalArgumentException("baseUrl cannot be null or empty");
+ }
+ Retrofit retrofit = new Retrofit.Builder()
+ .baseUrl(baseUrl)

.client(okHttpClientBuilder.build())
.addConverterFactory(GsonConverterFactory.create(GSON))

Review comment (Contributor): Same here, use `JacksonConverterFactory` instead of `GsonConverterFactory`.

.build();
this.api = retrofit.create(SearchApiWebSearchApi.class);
}

SearchApiWebSearchResponse search(SearchApiWebSearchRequest request) {
Map<String, Object> finalParameters = new HashMap<>(request.getFinalOptionalParameters());
finalParameters.put("engine", request.getEngine());
finalParameters.put("q", request.getQuery());
String bearerToken = "Bearer " + request.getApiKey();
try {
Response<SearchApiWebSearchResponse> response = api.search(finalParameters, bearerToken).execute();
return getBody(response);
} catch (IOException e) {
throw new RuntimeException(e);
}
}

private SearchApiWebSearchResponse getBody(Response<SearchApiWebSearchResponse> response) throws IOException {
if (response.isSuccessful()) {
return response.body();
} else {
throw toException(response);
}
}

private static RuntimeException toException(Response<?> response) throws IOException {
try (ResponseBody responseBody = response.errorBody()) {
int code = response.code();
if (responseBody != null) {
String body = responseBody.string();
String errorMessage = String.format("status code: %s; body: %s", code, body);
return new RuntimeException(errorMessage);
} else {
return new RuntimeException(String.format("status code: %s;", code));
}
}
}
}
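
The parameter handling in `search` above can be tried in isolation. The sketch below mirrors only the merge order and the header value shown in the diff (optional parameters first, then `engine` and `q`, plus a `Bearer` token); the class name `SearchParametersSketch` and its method names are hypothetical, and plain maps stand in for `SearchApiWebSearchRequest`.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-alone sketch of the merge performed in SearchApiClient.search.
final class SearchParametersSketch {

    /**
     * Copies the optional parameters, then sets "engine" and "q".
     * Because the two puts happen last, they override any same-named
     * keys that a caller placed in the optional parameters.
     */
    static Map<String, Object> finalParameters(Map<String, Object> optional,
                                               String engine,
                                               String query) {
        Map<String, Object> merged = new HashMap<>(optional); // defensive copy
        merged.put("engine", engine);
        merged.put("q", query);
        return merged;
    }

    /** Builds the Authorization header value the same way the client does. */
    static String bearerToken(String apiKey) {
        return "Bearer " + apiKey;
    }

    public static void main(String[] args) {
        Map<String, Object> optional = new HashMap<>();
        optional.put("gl", "us");
        optional.put("engine", "bing"); // overridden by the explicit engine below

        System.out.println(finalParameters(optional, "google", "madrid family activities"));
        System.out.println(bearerToken("YOUR_SEARCHAPI_KEY"));
    }
}
```

One consequence worth noting for reviewers: a caller cannot change `engine` or `q` through the optional parameters, since the explicit values always win, and the caller's map is left unmodified.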
@@ -0,0 +1,15 @@
package dev.langchain4j.web.search.searchapi;

import retrofit2.Call;
import retrofit2.http.GET;
import retrofit2.http.Header;
import retrofit2.http.QueryMap;

import java.util.Map;

interface SearchApiWebSearchApi {

Review comment (Contributor): The name of this interface is odd, since nothing at this level uses web search as such yet. Better names might be `SearchApi` for the interface and `SearchApiClient` for the client.


@GET("/api/v1/search")
Call<SearchApiWebSearchResponse> search(@QueryMap Map<String, Object> params,
@Header("Authorization") String bearerToken);
}
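
For readers unfamiliar with Retrofit, the interface above describes a `GET` to `/api/v1/search` with the map entries as query parameters and the token in the `Authorization` header. The sketch below builds a roughly equivalent request with the JDK's `java.net.http` API, without sending it. It is an illustration under assumptions: the class name `SearchRequestSketch` is hypothetical, the base URL `https://www.searchapi.io` is assumed rather than taken from the PR, and Retrofit's own encoding of query values may differ in detail (for example in how spaces are encoded).

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical sketch: what the Retrofit interface roughly translates to.
final class SearchRequestSketch {

    static HttpRequest build(String baseUrl, String apiKey, Map<String, Object> params) {
        // Sorted copy only so that the query-string order is deterministic here.
        Map<String, Object> sorted = new TreeMap<>(params);
        String query = sorted.entrySet().stream()
                .map(e -> encode(e.getKey()) + "=" + encode(String.valueOf(e.getValue())))
                .collect(Collectors.joining("&"));

        return HttpRequest.newBuilder(URI.create(baseUrl + "/api/v1/search?" + query))
                .header("Authorization", "Bearer " + apiKey)
                .GET()
                .build();
    }

    // Form-style encoding: spaces become '+'.
    private static String encode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        Map<String, Object> params = Map.of("engine", "google", "q", "langchain4j web search");
        // Base URL is an assumption for this example.
        HttpRequest request = build("https://www.searchapi.io", "YOUR_SEARCHAPI_KEY", params);
        System.out.println(request.method() + " " + request.uri());
    }
}
```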