Commit 85d0af5: docs: Installation, license README.md

Anush008 committed Oct 3, 2023 (1 parent: 4ebef86)
Showing 1 changed file, README.md, with 18 additions and 5 deletions.

## 🍕 Features

- Supports synchronous usage. No dependency on Tokio.
- Uses [@huggingface/tokenizers](https://github.com/huggingface/tokenizers) for blazing-fast encodings.
- Supports batch embeddings with parallelism using Rayon.

The default embedding supports "query" and "passage" prefixes for the input text. The default model is Flag Embedding, which is at the top of the [MTEB](https://huggingface.co/spaces/mteb/leaderboard) leaderboard.

Supported models include:
- [**sentence-transformers/all-MiniLM-L12-v2**](https://huggingface.co/sentence-transformers/all-MiniLM-L12-v2)
- [**intfloat/multilingual-e5-large**](https://huggingface.co/intfloat/multilingual-e5-large)


## 🚀 Installation

Run the following Cargo command in your project directory:

```bash
cargo add fastembed
```

Or add the following line to the `[dependencies]` section of your Cargo.toml:

```toml
fastembed = "1"
```

## 📖 Usage

```rust
use fastembed::{EmbeddingBase, FlagEmbedding};

// Initialize the model with the default options
let model: FlagEmbedding = FlagEmbedding::try_new(Default::default())?;

let documents = vec![
    "passage: Hello, World!",
    "query: Hello, World!",
];

// Generate embeddings, optionally specifying a batch size
let embeddings = model.embed(documents, None)?;
```
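
Each embedding comes back as a plain vector of floats, so downstream similarity math needs no special types. A minimal sketch of cosine similarity (a hypothetical helper, not provided by fastembed itself):

```rust
// Cosine similarity between two embeddings (illustrative helper)
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (norm_a * norm_b)
}
```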

### Supports passage and query embeddings for more accurate results

```rust
// Generate embeddings for the passages
// The texts are prefixed with "passage" for better results
let passages = vec!["This is a passage. It carries the document text."];
let embeddings = model.passage_embed(passages, None)?;

// Generate an embedding for the query, prefixed with "query"
let query_embedding = model.query_embed("What does the passage say?")?;
```
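
Putting the two together, simple retrieval is a nearest-neighbor scan over the passage embeddings. A sketch using the illustrative `cosine_similarity` helper defined above (not part of the crate):

```rust
// Rank passages by cosine similarity to the query (illustrative only)
let mut ranked: Vec<(usize, f32)> = embeddings
    .iter()
    .enumerate()
    .map(|(i, e)| (i, cosine_similarity(&query_embedding, e)))
    .collect();
ranked.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
```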
### Why fast?

It's important we justify the "fast" in FastEmbed. FastEmbed is fast because:

1. Quantized model weights
2. ONNX Runtime, which allows for inference on CPU, GPU, and other dedicated runtimes

### Why light?

1. No hidden dependencies via Huggingface Transformers

### Why accurate?

1. Better than OpenAI Ada-002
2. Top of the embedding leaderboards, e.g. [MTEB](https://huggingface.co/spaces/mteb/leaderboard)

## 📄 LICENSE

MIT © [2023](https://github.com/Anush008/fastembed-rs/blob/main/LICENSE)
