6 changes: 6 additions & 0 deletions .changeset/smooth-parrots-speak.md
@@ -0,0 +1,6 @@
---
'@firebase/ai': minor
'firebase': minor
---

Add `inferenceSource` to the response from `generateContent` and `generateContentStream`. This property indicates whether on-device or in-cloud inference was used to generate the result.
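
A minimal usage sketch (assuming an initialized `FirebaseApp` named `app` and a model created for hybrid inference; the prompt and log messages are illustrative):

```typescript
import { getAI, getGenerativeModel, InferenceMode, InferenceSource } from 'firebase/ai';

// Assumes `app` is an initialized FirebaseApp.
const model = getGenerativeModel(getAI(app), {
  mode: InferenceMode.PREFER_ON_DEVICE // falls back to the cloud when no local model is available
});

const result = await model.generateContent('Summarize hybrid inference in one sentence.');
// The new property reports which backend actually produced this response.
if (result.response.inferenceSource === InferenceSource.ON_DEVICE) {
  console.log('Generated by the on-device model.');
} else {
  console.log('Generated by the cloud backend.');
}
```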
11 changes: 11 additions & 0 deletions common/api-review/ai.api.md
@@ -256,6 +256,8 @@ export { Date_2 as Date }
// @public
export interface EnhancedGenerateContentResponse extends GenerateContentResponse {
functionCalls: () => FunctionCall[] | undefined;
// Warning: (ae-incompatible-release-tags) The symbol "inferenceSource" is marked as @public, but its signature references "InferenceSource" which is marked as @beta
inferenceSource?: InferenceSource;
inlineDataParts: () => InlineDataPart[] | undefined;
text: () => string;
thoughtSummary: () => string | undefined;
@@ -816,6 +818,15 @@ export const InferenceMode: {
// @beta
export type InferenceMode = (typeof InferenceMode)[keyof typeof InferenceMode];

// @beta
export const InferenceSource: {
readonly ON_DEVICE: "on_device";
readonly IN_CLOUD: "in_cloud";
};

// @beta
export type InferenceSource = (typeof InferenceSource)[keyof typeof InferenceSource];

// @public
export interface InlineDataPart {
// (undocumented)
11 changes: 11 additions & 0 deletions docs-devsite/ai.enhancedgeneratecontentresponse.md
@@ -24,6 +24,7 @@ export interface EnhancedGenerateContentResponse extends GenerateContentResponse
| Property | Type | Description |
| --- | --- | --- |
| [functionCalls](./ai.enhancedgeneratecontentresponse.md#enhancedgeneratecontentresponsefunctioncalls) | () =&gt; [FunctionCall](./ai.functioncall.md#functioncall_interface)<!-- -->\[\] \| undefined | Aggregates and returns every [FunctionCall](./ai.functioncall.md#functioncall_interface) from the first candidate of [GenerateContentResponse](./ai.generatecontentresponse.md#generatecontentresponse_interface)<!-- -->. |
| [inferenceSource](./ai.enhancedgeneratecontentresponse.md#enhancedgeneratecontentresponseinferencesource) | [InferenceSource](./ai.md#inferencesource) | Indicates whether inference happened on-device or in-cloud. |
| [inlineDataParts](./ai.enhancedgeneratecontentresponse.md#enhancedgeneratecontentresponseinlinedataparts) | () =&gt; [InlineDataPart](./ai.inlinedatapart.md#inlinedatapart_interface)<!-- -->\[\] \| undefined | Aggregates and returns every [InlineDataPart](./ai.inlinedatapart.md#inlinedatapart_interface) from the first candidate of [GenerateContentResponse](./ai.generatecontentresponse.md#generatecontentresponse_interface)<!-- -->. |
| [text](./ai.enhancedgeneratecontentresponse.md#enhancedgeneratecontentresponsetext) | () =&gt; string | Returns the text string from the response, if available. Throws if the prompt or candidate was blocked. |
| [thoughtSummary](./ai.enhancedgeneratecontentresponse.md#enhancedgeneratecontentresponsethoughtsummary) | () =&gt; string \| undefined | Aggregates and returns every [TextPart](./ai.textpart.md#textpart_interface) with their <code>thought</code> property set to <code>true</code> from the first candidate of [GenerateContentResponse](./ai.generatecontentresponse.md#generatecontentresponse_interface)<!-- -->. |
@@ -38,6 +39,16 @@ Aggregates and returns every [FunctionCall](./ai.functioncall.md#functioncall_in
functionCalls: () => FunctionCall[] | undefined;
```

## EnhancedGenerateContentResponse.inferenceSource

Indicates whether inference happened on-device or in-cloud.

<b>Signature:</b>

```typescript
inferenceSource?: InferenceSource;
```
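
For example, a caller might branch on this value after a request (a sketch; `model` and `prompt` are assumed to be defined):

```typescript
const { response } = await model.generateContent(prompt);
if (response.inferenceSource === InferenceSource.ON_DEVICE) {
  // e.g., label the output as locally generated
}
```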

## EnhancedGenerateContentResponse.inlineDataParts

Aggregates and returns every [InlineDataPart](./ai.inlinedatapart.md#inlinedatapart_interface) from the first candidate of [GenerateContentResponse](./ai.generatecontentresponse.md#generatecontentresponse_interface)<!-- -->.
31 changes: 31 additions & 0 deletions docs-devsite/ai.md
@@ -162,6 +162,7 @@ The Firebase AI Web SDK.
| [ImagenPersonFilterLevel](./ai.md#imagenpersonfilterlevel) | A filter level controlling whether generation of images containing people or faces is allowed.<!-- -->See the <a href="http://firebase.google.com/docs/vertex-ai/generate-images">personGeneration</a> documentation for more details. |
| [ImagenSafetyFilterLevel](./ai.md#imagensafetyfilterlevel) | A filter level controlling how aggressively to filter sensitive content.<!-- -->Text prompts provided as inputs and images (generated or uploaded) through Imagen on Vertex AI are assessed against a list of safety filters, which include 'harmful categories' (for example, <code>violence</code>, <code>sexual</code>, <code>derogatory</code>, and <code>toxic</code>). This filter level controls how aggressively to filter out potentially harmful content from responses. See the [documentation](http://firebase.google.com/docs/vertex-ai/generate-images) and the [Responsible AI and usage guidelines](https://cloud.google.com/vertex-ai/generative-ai/docs/image/responsible-ai-imagen#safety-filters) for more details. |
| [InferenceMode](./ai.md#inferencemode) | <b><i>(Public Preview)</i></b> Determines whether inference happens on-device or in-cloud. |
| [InferenceSource](./ai.md#inferencesource) | <b><i>(Public Preview)</i></b> Indicates whether inference happened on-device or in-cloud. |
| [Language](./ai.md#language) | <b><i>(Public Preview)</i></b> The programming language of the code. |
| [LiveResponseType](./ai.md#liveresponsetype) | <b><i>(Public Preview)</i></b> The types of responses that can be returned by [LiveSession.receive()](./ai.livesession.md#livesessionreceive)<!-- -->. |
| [Modality](./ai.md#modality) | Content part modality. |
@@ -189,6 +190,7 @@ The Firebase AI Web SDK.
| [ImagenPersonFilterLevel](./ai.md#imagenpersonfilterlevel) | A filter level controlling whether generation of images containing people or faces is allowed.<!-- -->See the <a href="http://firebase.google.com/docs/vertex-ai/generate-images">personGeneration</a> documentation for more details. |
| [ImagenSafetyFilterLevel](./ai.md#imagensafetyfilterlevel) | A filter level controlling how aggressively to filter sensitive content.<!-- -->Text prompts provided as inputs and images (generated or uploaded) through Imagen on Vertex AI are assessed against a list of safety filters, which include 'harmful categories' (for example, <code>violence</code>, <code>sexual</code>, <code>derogatory</code>, and <code>toxic</code>). This filter level controls how aggressively to filter out potentially harmful content from responses. See the [documentation](http://firebase.google.com/docs/vertex-ai/generate-images) and the [Responsible AI and usage guidelines](https://cloud.google.com/vertex-ai/generative-ai/docs/image/responsible-ai-imagen#safety-filters) for more details. |
| [InferenceMode](./ai.md#inferencemode) | <b><i>(Public Preview)</i></b> Determines whether inference happens on-device or in-cloud. |
| [InferenceSource](./ai.md#inferencesource) | <b><i>(Public Preview)</i></b> Indicates whether inference happened on-device or in-cloud. |
| [Language](./ai.md#language) | <b><i>(Public Preview)</i></b> The programming language of the code. |
| [LanguageModelMessageContentValue](./ai.md#languagemodelmessagecontentvalue) | <b><i>(Public Preview)</i></b> Content formats that can be provided as on-device message content. |
| [LanguageModelMessageRole](./ai.md#languagemodelmessagerole) | <b><i>(Public Preview)</i></b> Allowable roles for on-device language model usage. |
@@ -643,6 +645,22 @@ InferenceMode: {
}
```

## InferenceSource

> This API is provided as a preview for developers and may change based on feedback that we receive. Do not use this API in a production environment.
>

Indicates whether inference happened on-device or in-cloud.

<b>Signature:</b>

```typescript
InferenceSource: {
readonly ON_DEVICE: "on_device";
readonly IN_CLOUD: "in_cloud";
}
```
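
Like `InferenceMode`, this is a frozen map of string constants rather than a TypeScript enum. A small sketch of consuming it:

```typescript
function describeSource(source: InferenceSource): string {
  // The underlying values are the literal strings 'on_device' and 'in_cloud'.
  return source === InferenceSource.ON_DEVICE ? 'on-device' : 'in-cloud';
}
```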

## Language

> This API is provided as a preview for developers and may change based on feedback that we receive. Do not use this API in a production environment.
@@ -926,6 +944,19 @@ Determines whether inference happens on-device or in-cloud.
export type InferenceMode = (typeof InferenceMode)[keyof typeof InferenceMode];
```

## InferenceSource

> This API is provided as a preview for developers and may change based on feedback that we receive. Do not use this API in a production environment.
>

Indicates whether inference happened on-device or in-cloud.

<b>Signature:</b>

```typescript
export type InferenceSource = (typeof InferenceSource)[keyof typeof InferenceSource];
```

## Language

> This API is provided as a preview for developers and may change based on feedback that we receive. Do not use this API in a production environment.
11 changes: 6 additions & 5 deletions packages/ai/src/methods/generate-content.ts
@@ -57,14 +57,14 @@ export async function generateContentStream(
chromeAdapter?: ChromeAdapter,
requestOptions?: RequestOptions
): Promise<GenerateContentStreamResult> {
const response = await callCloudOrDevice(
const callResult = await callCloudOrDevice(
params,
chromeAdapter,
() => chromeAdapter!.generateContentStream(params),
() =>
generateContentStreamOnCloud(apiSettings, model, params, requestOptions)
);
return processStream(response, apiSettings); // TODO: Map streaming responses
return processStream(callResult.response, apiSettings); // TODO: Map streaming responses
}

async function generateContentOnCloud(
@@ -93,18 +93,19 @@ export async function generateContent(
chromeAdapter?: ChromeAdapter,
requestOptions?: RequestOptions
): Promise<GenerateContentResult> {
const response = await callCloudOrDevice(
const callResult = await callCloudOrDevice(
params,
chromeAdapter,
() => chromeAdapter!.generateContent(params),
() => generateContentOnCloud(apiSettings, model, params, requestOptions)
);
const generateContentResponse = await processGenerateContentResponse(
response,
callResult.response,
apiSettings
);
const enhancedResponse = createEnhancedContentResponse(
generateContentResponse
generateContentResponse,
callResult.inferenceSource
);
return {
response: enhancedResponse
28 changes: 20 additions & 8 deletions packages/ai/src/requests/hybrid-helpers.test.ts
@@ -18,7 +18,12 @@
import { use, expect } from 'chai';
import { SinonStub, SinonStubbedInstance, restore, stub } from 'sinon';
import { callCloudOrDevice } from './hybrid-helpers';
import { GenerateContentRequest, InferenceMode, AIErrorCode } from '../types';
import {
GenerateContentRequest,
InferenceMode,
AIErrorCode,
InferenceSource
} from '../types';
import { AIError } from '../errors';
import sinonChai from 'sinon-chai';
import chaiAsPromised from 'chai-as-promised';
@@ -58,7 +63,8 @@ describe('callCloudOrDevice', () => {
onDeviceCall,
inCloudCall
);
expect(result).to.equal('in-cloud-response');
expect(result.response).to.equal('in-cloud-response');
expect(result.inferenceSource).to.equal(InferenceSource.IN_CLOUD);
expect(inCloudCall).to.have.been.calledOnce;
expect(onDeviceCall).to.not.have.been.called;
});
@@ -76,7 +82,8 @@
onDeviceCall,
inCloudCall
);
expect(result).to.equal('on-device-response');
expect(result.response).to.equal('on-device-response');
expect(result.inferenceSource).to.equal(InferenceSource.ON_DEVICE);
expect(onDeviceCall).to.have.been.calledOnce;
expect(inCloudCall).to.not.have.been.called;
});
@@ -89,7 +96,8 @@
onDeviceCall,
inCloudCall
);
expect(result).to.equal('in-cloud-response');
expect(result.response).to.equal('in-cloud-response');
expect(result.inferenceSource).to.equal(InferenceSource.IN_CLOUD);
expect(inCloudCall).to.have.been.calledOnce;
expect(onDeviceCall).to.not.have.been.called;
});
@@ -108,7 +116,8 @@
onDeviceCall,
inCloudCall
);
expect(result).to.equal('on-device-response');
expect(result.response).to.equal('on-device-response');
expect(result.inferenceSource).to.equal(InferenceSource.ON_DEVICE);
expect(onDeviceCall).to.have.been.calledOnce;
expect(inCloudCall).to.not.have.been.called;
});
Expand Down Expand Up @@ -136,7 +145,8 @@ describe('callCloudOrDevice', () => {
onDeviceCall,
inCloudCall
);
expect(result).to.equal('in-cloud-response');
expect(result.response).to.equal('in-cloud-response');
expect(result.inferenceSource).to.equal(InferenceSource.IN_CLOUD);
expect(inCloudCall).to.have.been.calledOnce;
expect(onDeviceCall).to.not.have.been.called;
});
@@ -154,7 +164,8 @@
onDeviceCall,
inCloudCall
);
expect(result).to.equal('in-cloud-response');
expect(result.response).to.equal('in-cloud-response');
expect(result.inferenceSource).to.equal(InferenceSource.IN_CLOUD);
expect(inCloudCall).to.have.been.calledOnce;
expect(onDeviceCall).to.not.have.been.called;
});
@@ -169,7 +180,8 @@
onDeviceCall,
inCloudCall
);
expect(result).to.equal('on-device-response');
expect(result.response).to.equal('on-device-response');
expect(result.inferenceSource).to.equal(InferenceSource.ON_DEVICE);
expect(inCloudCall).to.have.been.calledOnce;
expect(onDeviceCall).to.have.been.calledOnce;
});
45 changes: 36 additions & 9 deletions packages/ai/src/requests/hybrid-helpers.ts
@@ -20,7 +20,8 @@ import {
GenerateContentRequest,
InferenceMode,
AIErrorCode,
ChromeAdapter
ChromeAdapter,
InferenceSource
} from '../types';
import { ChromeAdapterImpl } from '../methods/chrome-adapter';

@@ -33,6 +34,11 @@ const errorsCausingFallback: AIErrorCode[] = [
AIErrorCode.API_NOT_ENABLED
];

interface CallResult<Response> {
response: Response;
inferenceSource: InferenceSource;
}

/**
* Dispatches a request to the appropriate backend (on-device or in-cloud)
* based on the inference mode.
@@ -48,35 +54,56 @@ export async function callCloudOrDevice<Response>(
chromeAdapter: ChromeAdapter | undefined,
onDeviceCall: () => Promise<Response>,
inCloudCall: () => Promise<Response>
): Promise<Response> {
): Promise<CallResult<Response>> {
if (!chromeAdapter) {
return inCloudCall();
return {
response: await inCloudCall(),
inferenceSource: InferenceSource.IN_CLOUD
};
}
switch ((chromeAdapter as ChromeAdapterImpl).mode) {
case InferenceMode.ONLY_ON_DEVICE:
if (await chromeAdapter.isAvailable(request)) {
return onDeviceCall();
return {
response: await onDeviceCall(),
inferenceSource: InferenceSource.ON_DEVICE
};
}
throw new AIError(
AIErrorCode.UNSUPPORTED,
'Inference mode is ONLY_ON_DEVICE, but an on-device model is not available.'
);
case InferenceMode.ONLY_IN_CLOUD:
return inCloudCall();
return {
response: await inCloudCall(),
inferenceSource: InferenceSource.IN_CLOUD
};
case InferenceMode.PREFER_IN_CLOUD:
try {
return await inCloudCall();
return {
response: await inCloudCall(),
inferenceSource: InferenceSource.IN_CLOUD
};
} catch (e) {
if (e instanceof AIError && errorsCausingFallback.includes(e.code)) {
return onDeviceCall();
return {
response: await onDeviceCall(),
inferenceSource: InferenceSource.ON_DEVICE
};
}
throw e;
}
case InferenceMode.PREFER_ON_DEVICE:
if (await chromeAdapter.isAvailable(request)) {
return onDeviceCall();
return {
response: await onDeviceCall(),
inferenceSource: InferenceSource.ON_DEVICE
};
}
return inCloudCall();
return {
response: await inCloudCall(),
inferenceSource: InferenceSource.IN_CLOUD
};
default:
throw new AIError(
AIErrorCode.ERROR,
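
A sketch of how a caller consumes the new `CallResult` shape (the two thunks below are stand-ins for the real on-device and in-cloud request functions):

```typescript
import { callCloudOrDevice } from './hybrid-helpers';
import { ChromeAdapter, GenerateContentRequest } from '../types';

async function dispatch(
  request: GenerateContentRequest,
  adapter: ChromeAdapter | undefined
): Promise<void> {
  const callResult = await callCloudOrDevice(
    request,
    adapter,
    () => Promise.resolve('on-device-response'), // on-device thunk
    () => Promise.resolve('in-cloud-response') // in-cloud thunk
  );
  // inferenceSource records which branch actually ran, so callers no longer
  // need to re-derive it from the mode and availability checks.
  console.log(callResult.inferenceSource, callResult.response);
}
```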
7 changes: 5 additions & 2 deletions packages/ai/src/requests/response-helpers.ts
@@ -25,7 +25,8 @@ import {
ImagenInlineImage,
AIErrorCode,
InlineDataPart,
Part
Part,
InferenceSource
} from '../types';
import { AIError } from '../errors';
import { logger } from '../logger';
@@ -66,7 +67,8 @@ function hasValidCandidates(response: GenerateContentResponse): boolean {
* other modifications that improve usability.
*/
export function createEnhancedContentResponse(
response: GenerateContentResponse
response: GenerateContentResponse,
inferenceSource: InferenceSource = InferenceSource.IN_CLOUD
): EnhancedGenerateContentResponse {
/**
* The Vertex AI backend omits default values.
@@ -79,6 +81,7 @@ export function createEnhancedContentResponse(
}

const responseWithHelpers = addHelpers(response);
responseWithHelpers.inferenceSource = inferenceSource;
return responseWithHelpers;
}

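
Because `inferenceSource` defaults to `IN_CLOUD`, existing call sites that pass only a response keep their previous behavior. A sketch (assuming `rawResponse` is a parsed `GenerateContentResponse`):

```typescript
// Cloud-only call sites are unchanged and now yield a labeled response.
const enhanced = createEnhancedContentResponse(rawResponse);
console.log(enhanced.inferenceSource); // 'in_cloud' (the default)

// The hybrid path threads the actual source through explicitly.
const onDevice = createEnhancedContentResponse(rawResponse, InferenceSource.ON_DEVICE);
console.log(onDevice.inferenceSource); // 'on_device'
```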