.Net: Added better formatting for responses from Bing Searches & Ability to use custom Bing Search endpoint. (#5673)

Motivation and Context
Returning just the page snippet from a Bing search results in a lack of
context or ability to follow up on a response.

This PR addresses that by adding better formatting for Bing Search
responses via the "Plugins.Web" package.
It also adds a method to use a custom endpoint for Bing Search.

Description
Formatting Changes
Previous response format example:

![image](https://github.com/microsoft/semantic-kernel/assets/95053834/afad8167-0495-4a99-8975-b2090920cfd4)

New response format example:

![image](https://github.com/microsoft/semantic-kernel/assets/95053834/afad8167-0495-4a99-8975-b2090920cfd4)

As the examples show, the new response format contains much more
context and provides the user with the URL of the search result so
that they can click through and read further.

The URL can later be extracted programmatically (for example, with regex
matching) and passed to the "[Search Url
Plugin](https://github.com/microsoft/semantic-kernel/blob/main/dotnet/src/Plugins/Plugins.Web/SearchUrlPlugin.cs)"
to scrape the page directly, so that a complete summary of the page can
be returned to the user.
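
As a rough illustration of the regex step, the sketch below pulls the `url` values out of the JSON string that `GetSearchResultsAsync` returns. It assumes the response is the default `System.Text.Json` serialization of the `WebPage` list (properties `name`, `url`, `snippet`). The `SearchResultUrls` class and its `Extract` method are hypothetical names and are not part of Plugins.Web.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only; it is not part of Plugins.Web.
public static class SearchResultUrls
{
    // Captures the raw JSON string literal that follows each "url" property name.
    private static readonly Regex s_urlProperty =
        new("\"url\"\\s*:\\s*\"(?<url>(?:[^\"\\\\]|\\\\.)*)\"", RegexOptions.Compiled);

    // Returns every "url" value found in the serialized search results.
    public static List<string> Extract(string json)
    {
        List<string> urls = new();
        foreach (Match match in s_urlProperty.Matches(json))
        {
            // The default serializer escapes characters such as '&' as \u0026,
            // so the captured literal is decoded by parsing it as a JSON string.
            string literal = "\"" + match.Groups["url"].Value + "\"";
            urls.Add(JsonSerializer.Deserialize<string>(literal) ?? string.Empty);
        }

        return urls;
    }
}
```

Parsing the response with `JsonSerializer.Deserialize<List<WebPage>>` would probably be more robust than a regex; the regex form is shown only because it is the approach described above.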

Endpoint Changes
The ability to use a custom endpoint for Bing Search also enables a path
to use an API Management instance as a front end for the Bing Search
API. This is required by some end users for scenarios such as enterprise
logging and usage counting for cross charge.

By default, the BingConnector targets the Bing Search endpoint directly.
Using a custom endpoint is entirely optional, and doing so requires no
other code changes in order to use Bing Search.
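
To make the endpoint handling concrete, here is a minimal, self-contained sketch of how the request URI appears to be composed in this change: the configured URI (or the default) is followed by `=`, the escaped query, and the `count` and `offset` parameters. `BingRequestUri.Build` is a hypothetical helper written for illustration, not the connector itself, and the `contoso-apim.example.net` host in the usage note is a made-up API Management address.

```csharp
using System;

// Illustrative sketch only; this mirrors the URI composition in the diff
// but is not the BingConnector class.
public static class BingRequestUri
{
    // Same default value as the DefaultUri constant added to BingConnector.
    private const string DefaultUri = "https://api.bing.microsoft.com/v7.0/search?q";

    // Builds the request URI, falling back to the default Bing endpoint
    // when no custom endpoint is supplied.
    public static Uri Build(string query, int count = 1, int offset = 0, Uri? customEndpoint = null)
    {
        Uri baseUri = customEndpoint ?? new Uri(DefaultUri);

        // "=" is appended directly after the base URI, so a custom endpoint
        // is expected to end with "?q", just like the default.
        return new Uri($"{baseUri}={Uri.EscapeDataString(query.Trim())}&count={count}&offset={offset}");
    }
}
```

With the connector itself, the equivalent call would be along the lines of `new BingConnector(apiKey, new Uri("https://contoso-apim.example.net/bing/v7.0/search?q"))`. Omitting the `uri` argument keeps the default endpoint.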

Contribution Checklist
- The code builds clean without any errors or warnings
- The PR follows the [SK Contribution Guidelines](https://github.com/microsoft/semantic-kernel/blob/main/CONTRIBUTING.md) and the [pre-submission formatting script](https://github.com/microsoft/semantic-kernel/blob/main/CONTRIBUTING.md#dev-scripts) raises no violations
- All unit tests pass, and I have added new tests where possible
- I didn't break anyone 😄

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: Mark Karle <mkarle@users.noreply.github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Lee Miller <lemiller@microsoft.com>
Co-authored-by: Shawn Callegari <36091529+shawncal@users.noreply.github.com>
Co-authored-by: Roger Barreto <19890735+RogerBarreto@users.noreply.github.com>
Co-authored-by: Mark Wallace <127216156+markwallace-microsoft@users.noreply.github.com>
Co-authored-by: Dmytro Struk <13853051+dmytrostruk@users.noreply.github.com>
Co-authored-by: Eduard van Valkenburg <eavanvalkenburg@users.noreply.github.com>
Co-authored-by: SergeyMenshykh <68852919+SergeyMenshykh@users.noreply.github.com>
Co-authored-by: Lisa Harrylock <lisaharrylock@gmail.com>
Co-authored-by: Shay Rojansky <roji@roji.org>
Co-authored-by: Anthony Puppo <anthonyosx@gmail.com>
Co-authored-by: Weihan Li <weihanli@outlook.com>
Co-authored-by: Jadyn <jadyn.wong@live.com>
Co-authored-by: Matthew Bolaños <matthewbolanos@gmail.com>
Co-authored-by: Abby Harrison <54643756+awharrison-28@users.noreply.github.com>
Co-authored-by: Weihan Li <weihan.li@iherb.com>
Co-authored-by: Gina Triolo <51341242+gitri-ms@users.noreply.github.com>
Co-authored-by: Jib <Jibzade@gmail.com>
Co-authored-by: feiyun0112 <feiyun0112@gmail.com>
Co-authored-by: Adarsh Acharya <132294330+AdarshAcharya5@users.noreply.github.com>
Co-authored-by: Abby Harrison <abby.harrison@microsoft.com>
Co-authored-by: Jib <jib.adegunloye@mongodb.com>
Co-authored-by: Steven Silvester <steven.silvester@ieee.org>
Co-authored-by: Roybott <RoyHerrod@Outlook.com>
Co-authored-by: John Liu <107901166+johnliu55-msft@users.noreply.github.com>
Co-authored-by: zhaozhiming <zhaozhiming@users.noreply.github.com>
Co-authored-by: Hiroshi Yoshioka <40815708+hyoshioka0128@users.noreply.github.com>
Co-authored-by: SergeyMenshykh <sergemenshikh@gmail.com>
Co-authored-by: Sun Zhigang <sunner@gmail.com>
Co-authored-by: Stephen Toub <stoub@microsoft.com>
Co-authored-by: RonSijm <RonSijm@users.noreply.github.com>
Co-authored-by: Joowon <joowon.kim@dm.snu.ac.kr>
Co-authored-by: Devis Lucato <dluc@users.noreply.github.com>
Co-authored-by: Devis Lucato <devis@microsoft.com>
Co-authored-by: Gil LaHaye <gillahaye@microsoft.com>
Co-authored-by: kevin-m-kent <38162246+kevin-m-kent@users.noreply.github.com>
Co-authored-by: Kevin Kent <kevinkent@NU-kkent-M.local>
Showing 6 changed files with 182 additions and 53 deletions.
@@ -19,7 +19,7 @@ public async Task SearchAsyncSucceedsAsync()
IEnumerable<string> expected = new[] { Guid.NewGuid().ToString() };

Mock<IWebSearchEngineConnector> connectorMock = new();
connectorMock.Setup(c => c.SearchAsync(It.IsAny<string>(), It.IsAny<int>(), It.IsAny<int>(), It.IsAny<CancellationToken>()))
connectorMock.Setup(c => c.SearchAsync<string>(It.IsAny<string>(), It.IsAny<int>(), It.IsAny<int>(), It.IsAny<CancellationToken>()))
.ReturnsAsync(expected);

WebSearchEnginePlugin target = new(connectorMock.Object);
@@ -32,4 +32,25 @@ public async Task SearchAsyncSucceedsAsync()
// Assert
connectorMock.VerifyAll();
}

[Fact]
public async Task GetSearchResultsSucceedsAsync()
{
// Arrange
IEnumerable<WebPage> expected = new List<WebPage>();

Mock<IWebSearchEngineConnector> connectorMock = new();
connectorMock.Setup(c => c.SearchAsync<WebPage>(It.IsAny<string>(), It.IsAny<int>(), It.IsAny<int>(), It.IsAny<CancellationToken>()))
.ReturnsAsync(expected);

WebSearchEnginePlugin target = new(connectorMock.Object);

string anyQuery = Guid.NewGuid().ToString();

// Act
await target.GetSearchResultsAsync(anyQuery);

// Assert
connectorMock.VerifyAll();
}
}
82 changes: 36 additions & 46 deletions dotnet/src/Plugins/Plugins.Web/Bing/BingConnector.cs
@@ -2,11 +2,9 @@

using System;
using System.Collections.Generic;
using System.Diagnostics.CodeAnalysis;
using System.Linq;
using System.Net.Http;
using System.Text.Json;
using System.Text.Json.Serialization;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Logging;
@@ -23,14 +21,17 @@ public sealed class BingConnector : IWebSearchEngineConnector
private readonly ILogger _logger;
private readonly HttpClient _httpClient;
private readonly string? _apiKey;
private readonly Uri? _uri = null;
private const string DefaultUri = "https://api.bing.microsoft.com/v7.0/search?q";

/// <summary>
/// Initializes a new instance of the <see cref="BingConnector"/> class.
/// </summary>
/// <param name="apiKey">The API key to authenticate the connector.</param>
/// <param name="uri">The URI of the Bing Search instance. Defaults to "https://api.bing.microsoft.com/v7.0/search?q".</param>
/// <param name="loggerFactory">The <see cref="ILoggerFactory"/> to use for logging. If null, no logging will be performed.</param>
public BingConnector(string apiKey, ILoggerFactory? loggerFactory = null) :
this(apiKey, HttpClientProvider.GetHttpClient(), loggerFactory)
public BingConnector(string apiKey, Uri? uri = null, ILoggerFactory? loggerFactory = null) :
this(apiKey, HttpClientProvider.GetHttpClient(), uri, loggerFactory)
{
}

@@ -39,8 +40,9 @@ public sealed class BingConnector : IWebSearchEngineConnector
/// </summary>
/// <param name="apiKey">The API key to authenticate the connector.</param>
/// <param name="httpClient">The HTTP client to use for making requests.</param>
/// <param name="uri">The URI of the Bing Search instance. Defaults to "https://api.bing.microsoft.com/v7.0/search?q".</param>
/// <param name="loggerFactory">The <see cref="ILoggerFactory"/> to use for logging. If null, no logging will be performed.</param>
public BingConnector(string apiKey, HttpClient httpClient, ILoggerFactory? loggerFactory = null)
public BingConnector(string apiKey, HttpClient httpClient, Uri? uri = null, ILoggerFactory? loggerFactory = null)
{
Verify.NotNull(httpClient);

@@ -49,22 +51,18 @@ public BingConnector(string apiKey, HttpClient httpClient, ILoggerFactory? logge
this._httpClient = httpClient;
this._httpClient.DefaultRequestHeaders.Add("User-Agent", HttpHeaderConstant.Values.UserAgent);
this._httpClient.DefaultRequestHeaders.Add(HttpHeaderConstant.Names.SemanticKernelVersion, HttpHeaderConstant.Values.GetAssemblyVersion(typeof(BingConnector)));
this._uri = uri ?? new Uri(DefaultUri);
}

/// <inheritdoc/>
public async Task<IEnumerable<string>> SearchAsync(string query, int count = 1, int offset = 0, CancellationToken cancellationToken = default)
public async Task<IEnumerable<T>> SearchAsync<T>(string query, int count = 1, int offset = 0, CancellationToken cancellationToken = default)
{
if (count is <= 0 or >= 50)
{
throw new ArgumentOutOfRangeException(nameof(count), count, $"{nameof(count)} value must be greater than 0 and less than 50.");
}

if (offset < 0)
{
throw new ArgumentOutOfRangeException(nameof(offset));
}

Uri uri = new($"https://api.bing.microsoft.com/v7.0/search?q={Uri.EscapeDataString(query)}&count={count}&offset={offset}");
Uri uri = new($"{this._uri}={Uri.EscapeDataString(query.Trim())}&count={count}&offset={offset}");

this._logger.LogDebug("Sending request: {Uri}", uri);

@@ -77,11 +75,33 @@ public async Task<IEnumerable<string>> SearchAsync(string query, int count = 1,
// Sensitive data, logging as trace, disabled by default
this._logger.LogTrace("Response content received: {Data}", json);

BingSearchResponse? data = JsonSerializer.Deserialize<BingSearchResponse>(json);
WebSearchResponse? data = JsonSerializer.Deserialize<WebSearchResponse>(json);

WebPage[]? results = data?.WebPages?.Value;

return results == null ? Enumerable.Empty<string>() : results.Select(x => x.Snippet);
List<T>? returnValues = new();
if (data?.WebPages?.Value != null)
{
if (typeof(T) == typeof(string))
{
WebPage[]? results = data?.WebPages?.Value;
returnValues = results?.Select(x => x.Snippet).ToList() as List<T>;
}
else if (typeof(T) == typeof(WebPage))
{
List<WebPage>? webPages = new();

foreach (var webPage in data.WebPages.Value)

{
webPages.Add(webPage);
}
returnValues = webPages.Take(count).ToList() as List<T>;
}
else
{
throw new NotSupportedException($"Type {typeof(T)} is not supported.");
}
}
return returnValues != null && returnValues.Count == 0 ? returnValues : returnValues.Take(count);
}

/// <summary>
@@ -101,34 +121,4 @@ private async Task<HttpResponseMessage> SendGetRequestAsync(Uri uri, Cancellatio

return await this._httpClient.SendWithSuccessCheckAsync(httpRequestMessage, cancellationToken).ConfigureAwait(false);
}

[SuppressMessage("Performance", "CA1812:Internal class that is apparently never instantiated",
Justification = "Class is instantiated through deserialization.")]
private sealed class BingSearchResponse
{
[JsonPropertyName("webPages")]
public WebPages? WebPages { get; set; }
}

[SuppressMessage("Performance", "CA1812:Internal class that is apparently never instantiated",
Justification = "Class is instantiated through deserialization.")]
private sealed class WebPages
{
[JsonPropertyName("value")]
public WebPage[]? Value { get; set; }
}

[SuppressMessage("Performance", "CA1812:Internal class that is apparently never instantiated",
Justification = "Class is instantiated through deserialization.")]
private sealed class WebPage
{
[JsonPropertyName("name")]
public string Name { get; set; } = string.Empty;

[JsonPropertyName("url")]
public string Url { get; set; } = string.Empty;

[JsonPropertyName("snippet")]
public string Snippet { get; set; } = string.Empty;
}
}
31 changes: 29 additions & 2 deletions dotnet/src/Plugins/Plugins.Web/Google/GoogleConnector.cs
@@ -56,7 +56,7 @@ public sealed class GoogleConnector : IWebSearchEngineConnector, IDisposable
}

/// <inheritdoc/>
public async Task<IEnumerable<string>> SearchAsync(
public async Task<IEnumerable<T>> SearchAsync<T>(
string query,
int count,
int offset,
@@ -80,7 +80,34 @@ public sealed class GoogleConnector : IWebSearchEngineConnector, IDisposable

var results = await search.ExecuteAsync(cancellationToken).ConfigureAwait(false);

return results.Items.Select(item => item.Snippet);
List<T>? returnValues = new();
if (results.Items != null)
{
if (typeof(T) == typeof(string))
{
returnValues = results.Items.Select(item => item.Snippet).ToList() as List<T>;
}
else if (typeof(T) == typeof(WebPage))
{
List<WebPage> webPages = new();
foreach (var item in results.Items)
{
WebPage webPage = new()
{
Name = item.Title,
Snippet = item.Snippet,
Url = item.Link
};
webPages.Add(webPage);
}
returnValues = webPages.Take(count).ToList() as List<T>;
}
else
{
throw new NotSupportedException($"Type {typeof(T)} is not supported.");
}
}
return returnValues != null && returnValues.Count == 0 ? returnValues : returnValues.Take(count);
}

/// <summary>
@@ -19,5 +19,5 @@ public interface IWebSearchEngineConnector
/// <param name="offset">Number of results to skip.</param>
/// <param name="cancellationToken">The <see cref="CancellationToken"/> to monitor for cancellation requests. The default is <see cref="CancellationToken.None"/>.</param>
/// <returns>First snippet returned from search.</returns>
Task<IEnumerable<string>> SearchAsync(string query, int count = 1, int offset = 0, CancellationToken cancellationToken = default);
Task<IEnumerable<T>> SearchAsync<T>(string query, int count = 1, int offset = 0, CancellationToken cancellationToken = default);
}
58 changes: 58 additions & 0 deletions dotnet/src/Plugins/Plugins.Web/WebPage.cs
@@ -0,0 +1,58 @@
// Copyright (c) Microsoft. All rights reserved.

using System.Diagnostics.CodeAnalysis;
using System.Text.Json.Serialization;

namespace Microsoft.SemanticKernel.Plugins.Web;

/// <summary>
/// A sealed class containing the deserialized response from the respective Web Search API.
/// </summary>
/// <returns>A WebPage object containing the Web Search API response data.</returns>
[SuppressMessage("Performance", "CA1056:Change the type of parameter 'uri'...",
Justification = "A constant Uri cannot be defined, as required by this class")]
public sealed class WebPage
{
/// <summary>
/// The name of the result.
/// </summary>
[JsonPropertyName("name")]
public string Name { get; set; } = string.Empty;
/// <summary>
/// The URL of the result.
/// </summary>
[JsonPropertyName("url")]
public string Url { get; set; } = string.Empty;
/// <summary>
/// The result snippet.
/// </summary>
[JsonPropertyName("snippet")]
public string Snippet { get; set; } = string.Empty;
}

/// <summary>
/// A sealed class containing the deserialized response from the respective Web Search API.
/// </summary>
/// <returns>A WebPages? object containing the WebPages array from a Search API response data or null.</returns>
public sealed class WebSearchResponse
{
/// <summary>
/// A nullable WebPages object containing the Web Search API response data.
/// </summary>
[JsonPropertyName("webPages")]
public WebPages? WebPages { get; set; }
}

/// <summary>
/// A sealed class containing the deserialized response from the Web respective Search API.
/// </summary>
/// <returns>A WebPages array object containing the Web Search API response data.</returns>
[SuppressMessage("Performance", "CA1819:Properties should not return arrays", Justification = "Required by the Web Search API")]
public sealed class WebPages
{
/// <summary>
/// a nullable WebPage array object containing the Web Search API response data.
/// </summary>
[JsonPropertyName("value")]
public WebPage[]? Value { get; set; }
}
39 changes: 36 additions & 3 deletions dotnet/src/Plugins/Plugins.Web/WebSearchEnginePlugin.cs
@@ -1,6 +1,7 @@
// Copyright (c) Microsoft. All rights reserved.

using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Linq;
using System.Text.Encodings.Web;
@@ -63,14 +64,46 @@ public WebSearchEnginePlugin(IWebSearchEngineConnector connector)
[Description("Number of results to skip")] int offset = 0,
CancellationToken cancellationToken = default)
{
var results = (await this._connector.SearchAsync(query, count, offset, cancellationToken).ConfigureAwait(false)).ToArray();
if (results.Length == 0)
var results = await this._connector.SearchAsync<string>(query, count, offset, cancellationToken).ConfigureAwait(false);
if (!results.Any())
{
throw new InvalidOperationException("Failed to get a response from the web search engine.");
}

return count == 1
? results[0] ?? string.Empty
? results.First() ?? string.Empty
: JsonSerializer.Serialize(results, s_jsonOptionsCache);
}

/// <summary>
/// Performs a web search using the provided query, count, and offset.
/// </summary>
/// <param name="query">The text to search for.</param>
/// <param name="count">The number of results to return. Default is 1.</param>
/// <param name="offset">The number of results to skip. Default is 0.</param>
/// <param name="cancellationToken">A cancellation token to observe while waiting for the task to complete.</param>
/// <returns>The return value contains the search results as an IEnumerable WebPage object serialized as a string</returns>
[KernelFunction, Description("Perform a web search and return complete results.")]
public async Task<string> GetSearchResultsAsync(
[Description("Text to search for")] string query,
[Description("Number of results")] int count = 1,
[Description("Number of results to skip")] int offset = 0,
CancellationToken cancellationToken = default)
{
IEnumerable<WebPage>? results = null;
try
{
results = await this._connector.SearchAsync<WebPage>(query, count, offset, cancellationToken).ConfigureAwait(false);
if (!results.Any())
{
throw new InvalidOperationException("Failed to get a response from the web search engine.");
}
}
catch (InvalidOperationException ex)
{
Console.WriteLine(ex.Message);
}

return JsonSerializer.Serialize(results);
}
}

0 comments on commit 9d52fef
