Skip to content

Conversation

@jakmro
Copy link
Contributor

@jakmro jakmro commented Mar 27, 2025

Description

Add text embeddings

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update (improves or adds clarity to existing documentation)

Tested on

  • iOS
  • Android

Checklist

  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have updated the documentation accordingly
  • My changes generate no new warnings

@jakmro jakmro force-pushed the @jakmro/text-embeddings-ios branch 3 times, most recently from 4cf7634 to 11bff0f Compare March 28, 2025 11:50
@jakmro jakmro changed the title feat: text embeddings ios feat: text embeddings Mar 28, 2025
@jakmro jakmro force-pushed the @jakmro/text-embeddings-ios branch from 11bff0f to d34a6bd Compare March 28, 2025 11:51
@jakmro jakmro requested review from chmjkb and mkopcins March 31, 2025 10:39
@jakmro jakmro marked this pull request as ready for review March 31, 2025 10:39
fun preprocess(input: String): Array<LongArray> {
val inputIds = tokenizer.encode(input).map { it.toLong() }.toLongArray()
val attentionMask = inputIds.map { if (it != 0L) 1L else 0L }.toLongArray()
return arrayOf(inputIds, attentionMask) // Shape: [2, max_length]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is max_length specified? I think mentioning it here would be nice

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

max_length is specified inside tokenizer.json

Comment on lines 10 to 11
modelSource: string | number;
tokenizerSource: string | number;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use ResourceSource instead of string | number

Comment on lines 23 to 16
- (NSArray *)forwardMultiple:(NSArray *)inputs {
NSMutableArray *shapes = [NSMutableArray new];
NSMutableArray *inputTypes = [NSMutableArray new];
NSNumber *numberOfInputs = [module getNumberOfInputs];

for (NSUInteger i = 0; i < [numberOfInputs intValue]; i++) {
[shapes addObject:[module getInputShape:[NSNumber numberWithInt:i]]];
[inputTypes addObject:[module getInputType:[NSNumber numberWithInt:i]]];
}

NSArray *result = [module forward:inputs shapes:shapes inputTypes:inputTypes];

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does not look correct, wouldn't it be better to actually check if length on inputs is the same as numberOfInputs and if not abort? Then we don't need to create another method just to copy 99% of the code. You can either add if sattement here checking if input is array of arrays or change it to only work with array of inputs

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and then change the name

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will be changed in another PR as mentioned in BaseModel.h. This change requires changes in multiple files and I decided to not include them here.

Comment on lines 136 to 147
const TextEmbeddingsSpec = require('./NativeTextEmbeddings').default;
const TextEmbeddings = TextEmbeddingsSpec
? TextEmbeddingsSpec
: new Proxy(
{},
{
get() {
throw new Error(LINKING_ERROR);
},
}
);

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add class for this like for other modules, e.g.

class _ClassificationModule {
  async forward(input: string): ReturnType<ClassificationInterface['forward']> {
    return await Classification.forward(input);
  }
  async loadModule(
    modelSource: string | number
  ): ReturnType<ClassificationInterface['loadModule']> {
    return await Classification.loadModule(modelSource);
  }
}

"devDependencies": {
"@babel/core": "^7.25.2",
"@types/react": "~18.3.12",
"react-native-executorch": "file:../../react-native-executorch-20250401133031.tgz",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cleanup

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will be cleaned up once v0.4.0 will be released on npm

title: TextEmbeddingsModule
sidebar_position: 8
---

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also add the description and the caution banner like in useTextEmbeddings page

@jakmro jakmro force-pushed the @jakmro/text-embeddings-ios branch 2 times, most recently from 06125c3 to a43a3ba Compare April 17, 2025 12:40
@pweglik pweglik mentioned this pull request Apr 22, 2025
10 tasks
@jakmro jakmro self-assigned this Apr 23, 2025
jakmro and others added 13 commits April 23, 2025 12:40
Co-authored-by: Mateusz Kopcinski <120639731+mkopcins@users.noreply.github.com>
Remove Llama Export, refactor docs

- [ ] Bug fix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to not work as expected)
- [x] Documentation update (improves or adds clarity to existing
documentation)

- [x] I have performed a self-review of my code
- [x] I have commented my code, particularly in hard-to-understand areas
- [x] I have updated the documentation accordingly
- [x] My changes generate no new warnings

---------

Co-authored-by: Jakub Chmura <92989966+chmjkb@users.noreply.github.com>
Modified forward function inside BaseModel to accept multiple inputs.
Removed unnecessary includes/imports. Made the model loading
synchronous.

- [ ] Bug fix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to not work as expected)
- [ ] Documentation update (improves or adds clarity to existing
documentation)

- [x] iOS
- [ ] Android

- [x] I have performed a self-review of my code
- [x] I have commented my code, particularly in hard-to-understand areas
- [x] I have updated the documentation accordingly
- [x] My changes generate no new warnings
jakmro and others added 3 commits April 23, 2025 12:45
Code refactor (example)

- [x] iOS
- [x] Android

- [x] I have performed a self-review of my code
- [x] I have commented my code, particularly in hard-to-understand areas
- [x] I have updated the documentation accordingly
- [x] My changes generate no new warnings

---------

Co-authored-by: Jakub Chmura <92989966+chmjkb@users.noreply.github.com>
Add tokenizer documentation

- [ ] Bug fix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to not work as expected)
- [x] Documentation update (improves or adds clarity to existing
documentation)

- [x] I have performed a self-review of my code
- [x] I have commented my code, particularly in hard-to-understand areas
- [x] I have updated the documentation accordingly
- [x] My changes generate no new warnings

---------

Co-authored-by: kopcion <mati3111@gmail.com>
@jakmro jakmro force-pushed the @jakmro/text-embeddings-ios branch from f0e49d6 to 568007b Compare April 23, 2025 11:04
@jakmro jakmro merged commit deb768a into v0.4.0-rc1 Apr 23, 2025
1 check passed
@jakmro jakmro deleted the @jakmro/text-embeddings-ios branch April 23, 2025 15:36
mkopcins added a commit that referenced this pull request Oct 15, 2025
## Description
Add text embeddings

### Type of change

- [ ] Bug fix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to not work as expected)
- [x] Documentation update (improves or adds clarity to existing
documentation)

### Tested on

- [x] iOS
- [x] Android

### Checklist

- [x] I have performed a self-review of my code
- [x] I have commented my code, particularly in hard-to-understand areas
- [x] I have updated the documentation accordingly
- [x] My changes generate no new warnings

---------

Co-authored-by: Mateusz Kopcinski <120639731+mkopcins@users.noreply.github.com>
Co-authored-by: Jakub Chmura <92989966+chmjkb@users.noreply.github.com>
Co-authored-by: kopcion <mati3111@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants