Why does newline negatively impact embedding performance? #418

Closed
ravwojdyla opened this issue Apr 27, 2023 · 7 comments
Labels
question Further information is requested

Comments

@ravwojdyla
Contributor

Describe the bug

While reading the code of the embeddings_utils I have stumbled upon this:

# replace newlines, which can negatively affect performance.
text = text.replace("\n", " ")

Could you please provide more context on:

replace newlines, which can negatively affect performance.

Are there any references/papers/numbers behind that negative impact?

To Reproduce

get_embedding("foo bar\nbaz")

Code snippets

No response

OS

macOS

Python version

3.10

Library version

0.27.4

@ravwojdyla ravwojdyla added the bug (Something isn't working) label Apr 27, 2023
@ravwojdyla
Contributor Author

ravwojdyla commented Apr 27, 2023

Sorry, maybe this should not be marked as a bug, though the user may not expect the text to be preprocessed (newlines removed). This could be confusing if they fingerprint the text and compute embeddings using this lib and something else.
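To illustrate the fingerprinting concern, here is a minimal sketch. The helper names `normalize_for_embedding` and `fingerprint` are hypothetical (they are not part of the library); the first one only mirrors the `text.replace("\n", " ")` line quoted above.

```python
import hashlib


def normalize_for_embedding(text: str) -> str:
    # Hypothetical helper: mirrors the preprocessing quoted above,
    # where every newline is replaced with a single space.
    return text.replace("\n", " ")


def fingerprint(text: str) -> str:
    # Hypothetical helper: a content fingerprint, here a SHA-256 hex digest.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


raw = "foo bar\nbaz"
sent = normalize_for_embedding(raw)

# The text that actually gets embedded differs from the text the caller
# fingerprinted, so the two fingerprints no longer match.
print(fingerprint(raw) == fingerprint(sent))  # prints False
```

A caller who keys a cache or an index on the fingerprint of the raw text would therefore be pairing it with an embedding of slightly different text.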

@ravwojdyla
Contributor Author

Git blame points me to @BorisPower, would appreciate your input please.

@BorisPower
Collaborator

This used to be a problem due to how the v1 embeddings were trained. However this was addressed in v2, and v2 should not have a problem with newlines!

@ravwojdyla
Contributor Author

@BorisPower ah, great, thanks for a prompt response, will remove that in a PR then

@ravwojdyla
Contributor Author

ravwojdyla commented Apr 27, 2023

@BorisPower submitted PR #419. Out of curiosity, do you happen to have reference(s) that explain why V1 had issues with newlines?

Edit: I would like to include that in the code as reference for posterity ^

hwchase17 pushed a commit to langchain-ai/langchain that referenced this issue May 2, 2023
Only 1st generation OpenAI embeddings models are negatively impacted by
new lines.

Context:
openai/openai-python#418 (comment)
@EliahKagan

EliahKagan commented Jul 12, 2023

Although this doesn't explain the reason, there's a more specific statement of which models perform better without newlines in the embeddings documentation:

With the -001 text embeddings (not -002, and not code embeddings), we suggest replacing newlines (\n) in your input with a single space, as we have seen worse results when newlines are present.

The first-generation code embedding models are those that end in -code-001; of the first-generation models, those are the ones whose input is intended to be code rather than natural-language text. The models that statement refers to as "-001 text embeddings" are the other first-generation models. (Those all end in -001, but they don't all end in -text-001.)

Unfortunately, the above statement is hard to link to, so there might not be any good way to include a comment linking to it in the code. It appears at the bottom of the "First generation models (not recommended)" section, which itself can't be linked to, and which is collapsed by default. It appears immediately under the Code search embeddings subsection, but while the page offers that link, that link does not currently navigate to that section of the page.

That embeddings_utils.get_embedding and the three closely related functions currently replace newlines even when a non-first-generation model is specified, and that they currently replace newlines even when a first-generation code model is specified, are arguably two separate bugs. But they can be fixed together. I am also inclined to think this issue could cover both. (However, if you think it's best to have a separate issue for -code-001 behavior, I can go ahead and open one.)
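As a rough sketch of what fixing both together could look like, the newline replacement can be gated on the model name using the suffix rule described above. The function names `should_replace_newlines` and `prepare_input` are hypothetical and not part of embeddings_utils, and the model names in the usage lines are only examples of the naming pattern.

```python
def should_replace_newlines(model: str) -> bool:
    # Assumption: first-generation models end in "-001", and the
    # first-generation code models are the ones ending in "-code-001".
    # Only the remaining "-001" text models get newline replacement.
    return model.endswith("-001") and not model.endswith("-code-001")


def prepare_input(text: str, model: str) -> str:
    # Hypothetical helper: preprocess text before requesting an embedding.
    if should_replace_newlines(model):
        return text.replace("\n", " ")
    return text


# Usage: only the first-generation text model has its input altered.
print(repr(prepare_input("foo bar\nbaz", "text-similarity-ada-001")))
print(repr(prepare_input("foo bar\nbaz", "text-embedding-ada-002")))
print(repr(prepare_input("foo bar\nbaz", "code-search-ada-code-001")))
```

Matching on the suffix rather than on "-text-001" follows the observation above that the first-generation text models all end in -001 but don't all end in -text-001.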

yisding added a commit to run-llama/LlamaIndexTS that referenced this issue Aug 30, 2023
@RobertCraigie RobertCraigie added question (Further information is requested) and removed bug (Something isn't working) labels Nov 6, 2023
@RobertCraigie
Collaborator

I'm going to go ahead and close this as the embeddings utils file has been removed in v1 of the SDK.

Please ask any API questions in the forum! https://community.openai.com/

@RobertCraigie RobertCraigie closed this as not planned Nov 6, 2023
Knordy added a commit to Knordy/langchainjs that referenced this issue Dec 11, 2023
…nai` embeddings. As removing newlines was beneficial for `V1` models (or `-001`), but should not be mandatory for `V2` models (or `-002`). This is explained in openai/openai-python#418 (comment)

Therefore updating this field to be in line with the default set model `text-embedding-ada-002`.

Also the langchain python library only enables this for `-001` models: https://github.com/langchain-ai/langchain/blob/c0f4b95aa9961724ab4569049b4c3bc12ebbacfc/libs/langchain/langchain/embeddings/openai.py#L466
jacoblee93 pushed a commit to langchain-ai/langchainjs that referenced this issue Dec 14, 2023
* Updating the default value for `stripNewLines` to `false` in the `openai` embeddings. As removing newlines was beneficial for `V1` models (or `-001`), but should not be mandatory for `V2` models (or `-002`). This is explained in openai/openai-python#418 (comment)

Therefore updating this field to be in line with the default set model `text-embedding-ada-002`.

Also the langchain python library only enables this for `-001` models: https://github.com/langchain-ai/langchain/blob/c0f4b95aa9961724ab4569049b4c3bc12ebbacfc/libs/langchain/langchain/embeddings/openai.py#L466

* Reverting the default value, so it's `false` by default again.
Marked with a comment to indicate this should be changed in a future minor release.
Referenced the PR, as it contains the necessary information as to why this should be updated.

* Resolving conflicts, adding changes again to new location.
ericckzhou added a commit to Ultrahi/langchainjs that referenced this issue Dec 21, 2023
* Release 0.0.193

* Pin zod-to-json-schema version (langchain-ai#3343)

* Release 0.0.194

* Fix ReAct agent hallucinating result (langchain-ai#3341)

* Adding self query for vectara (langchain-ai#3338)

* added self query for vectara vector store

* updated the docs

* skip the integration test

* Updated the comments in the example

* Rename test, add linter warning

---------

Co-authored-by: Adeel Ehsan <adeel.ehsan@wellthy.com>
Co-authored-by: jacoblee93 <jacoblee93@gmail.com>

* Updates to Vectara Implementation (langchain-ai#3332)

* updated documentation
added X-Source to header

* added deleteDocuments() method
updated generation of random ID from date to UUID-like
tests now fully executed and fixed to work properly

* added deleteDocuments to docs
fixes from yarn lint
keeping test.skip

* Removed **sentence-transformers/distilbert-base-nli-mean-tokens** as default model and added **BAAI/bge-base-en-v1.5** as default model when no model param is given. (langchain-ai#3323)

* Update hf.ts

Removed sentence-transformers/distilbert-base-nli-mean-tokens as default model and added BAAI/bge-base-en-v1.5 as default model when no model param is given

* Format

---------

Co-authored-by: jacoblee93 <jacoblee93@gmail.com>

* feat: add filters to `ChaindeskRetriever` (langchain-ai#3314)

* feat: add filters to `ChaindeskRetriver`

reference: https://docs.chaindesk.ai/api-reference/endpoint/datastores/query

* rename filter to filters

* run yarn format

* Add OpenAI Files for OpenAI assistant (langchain-ai#3228)

* Add File to Open Ai assistant

* 📝 Add documentation

* ✨ Add Open File API

* 📝 Add documentation on Open AI File API

* ✅ Add test on Open AI File API

* Update jsdoc types for params

* Fix openai request options import

* Extend serializable class and add return jsdoc types

* 🔧 Add experimental openai_files entrypoint.

* 📝 Build the doc

* ♻️ Refactor OpenAIFiles to allow custom client

* 📝 clean the JSDoc

* Use one documentation page

---------

Co-authored-by: Brace Sproul <braceasproul@gmail.com>
Co-authored-by: jacoblee93 <jacoblee93@gmail.com>

* Release 0.0.195

* Jacob/core (langchain-ai#3354)

* Split out core

* Update package lock and yarn lock

* Update core

* Fix build

* Fix tests

* Fix test, format

* Fix format

* Fix format

* Scripts

* Bump dep

* Fix examples

* Remove unneeded deps

* Update deps

* Fix exports plz

* plz

* Fix plz

* Fix

* Plox

* Fix

* Disable bun

* Disable API refs for now

* Fix build

* PLZZZZ

* Bump

* Bump version

* Skip test

* Release 0.0.196

* Refactor core (langchain-ai#3373)

* Refactor core

* Bump core version

* Bump core

* Update OpenAIAgent to support Runnable models (langchain-ai#3346)

* Update OpenAIAgent to support Runnable interface

* Add test with executor

* Call invoke for all paths and add CallOptions

* Format and fix test

---------

Co-authored-by: jacoblee93 <jacoblee93@gmail.com>

* Fix streaming of Bedrock on Cloudflare Workers (langchain-ai#3364)

* Fix streaming of Bedrock on Cloudflare Workers

* Handle buffer 0

* Remove unnecessary break

* Update quickstart.mdx (langchain-ai#3357)

* Skip non LTS Node version (langchain-ai#3374)

* Allow to stream files with `GithubRepoLoader` (langchain-ai#3339)

* add loadAsStream method in GithubRepoLoader

* apply review changes

* feat: implement max marginal relevance for momento vector index (langchain-ai#3351)

* chore: bump momento deps to get searchAndFetchVectors for MMR

* feat: implement max marginal relevance search for mvi and int tests

* fix: MongoDB Vector Search does not support integer as input (langchain-ai#3356)

* fix: MongoDB Vector Search does not support integer as input

* Fix lint + format

---------

Co-authored-by: jacoblee93 <jacoblee93@gmail.com>

* Move more core modules (langchain-ai#3376)

* Move more core modules

* Bump

* Format

* Adds standalone @langchain/anthropic package (langchain-ai#3377)

* Adds standalone @langchain/anthropic package

* Format

* Fix docker

* Fix Docker

* Support both old and new serialization ids, update prompt base class … (langchain-ai#3378)

* Support both old and new serialization ids, update prompt base class as example

* More namespace changes

* Update lc_namespaces

* Fix test

* Fix tests

* Bump versions

---------

Co-authored-by: jacoblee93 <jacoblee93@gmail.com>

* Rename to @langchain/core (langchain-ai#3381)

* Rename to @langchain/core

* Fix build

* Fix script

* Fixed docs build issue (langchain-ai#3382)

* Fixed docs build issue

* chore: lint files

* cr

* docs fix

* format

* cr

* Agent streaming (langchain-ai#3365)

* Agent streaming

* chore: lint files

* fix circular dep issue

* error handling

* cr

* cr

* fix any eslint

* rm commented out method

* Update langchain/src/agents/executor.ts

* Update langchain/src/agents/executor.ts

* cr

* drop husky (langchain-ai#3383)

* Add OpenAI package (langchain-ai#3385)

* Serialization

* Add OpenAI package

* Fix small core serialization issue (langchain-ai#3386)

* Fix small core serialization issue

* Bump

* rc.1

* Add core README (langchain-ai#3396)

* Revert dependencies for now (langchain-ai#3402)

* Revert dependencies for now

* Readd missing file

* Delegate to core

* Import map

* Remove workspace core dep (langchain-ai#3413)

* Update secret map (langchain-ai#3414)

* Version

* Secret map

* Release 0.0.197

* feat: add support for collection name in PG Vector (langchain-ai#3353)

* add support for collection name in PGVector

* lint

* add optional metadata as well

* format

* fix: Add HuggingFaceInference includeCredentials param (langchain-ai#3389)

* feat: Add HfInference includeCredentials prop

* Type fix

---------

Co-authored-by: Alex Naymushin <alexander.naymushin@omnigon.com>

* Use less tokens to describe a Neo4j graph schema (langchain-ai#3411)

* WIP

* Add proper typing

* Update inline snapshot to demonstrate what the new schema looks like

---------

Co-authored-by: Oskar Hane <oh@oskarhane.com>

* Add no focused jest tests eslint rule (langchain-ai#3422)

* Add no focused jest tests eslint rule

* unfocus test

* remove

* Add more tests for agent streaming (langchain-ai#3421)

* Add more tests for agent streaming

* add memory

* unfocus test

* chore: lint files

* cr

* chore: lint files

* Update langchain/src/agents/tests/agent.int.test.ts

* Added PyInterpreterTool (langchain-ai#3090)

* Added PyInterpreterTool

* Updated PyInterpreterTool

* Moved PythonInterpreterTool to tools/experimental

* Move to experimental, fix race condition in constructor

* Fix lint + test

* Adds docs + example

---------

Co-authored-by: Mish Ushakov <mishushakov@users.noreply.github.com>
Co-authored-by: jacoblee93 <jacoblee93@gmail.com>

* PPTX document loader (langchain-ai#3333)

* Add Powerpoint Loader

Added a powerpoint loader for pptx and unit test.

* Add pptx loader example

* Add documentation

* Resolve the problems in the comment

Move "officeparser" to the peer + dev dependencies and mark it as optional.
Create a separate enrtypoint and reformat the code

* Run build

* Reformat the code again

---------

Co-authored-by: jacoblee93 <jacoblee93@gmail.com>

* ✨ Add modify and retrieve openAI assistant (langchain-ai#3387)

* Return null if the input is undefined value. (langchain-ai#3412)

* Implement ClickHouse Support (langchain-ai#3342)

* Implement ClickHouse Support

Co-Authored-By: CalebZhang <64758307+kz4ever@users.noreply.github.com>
Co-Authored-By: Divyansh Kachchhava <114203084+divyanshuoft@users.noreply.github.com>
Co-Authored-By: Alfred Tze-Hong Ha <47645447+xxlalfredo99@users.noreply.github.com>

* Update ClickHouse client dependency

Co-Authored-By: Divyansh Kachchhava <114203084+divyanshuoft@users.noreply.github.com>
Co-Authored-By: CalebZhang <64758307+kz4ever@users.noreply.github.com>
Co-Authored-By: Alfred Tze-Hong Ha <47645447+xxlalfredo99@users.noreply.github.com>

* Update int test to use test.skip()

* Fix SQL injection risk

* Add peer deps docs

* fix yarn lint issue

---------

Co-authored-by: CalebZhang <64758307+kz4ever@users.noreply.github.com>
Co-authored-by: Divyansh Kachchhava <114203084+divyanshuoft@users.noreply.github.com>
Co-authored-by: Alfred Tze-Hong Ha <47645447+xxlalfredo99@users.noreply.github.com>
Co-authored-by: jacoblee93 <jacoblee93@gmail.com>
Co-authored-by: CalebZhang <kevineview@gmail.com>

* integration[patch]: feat: implement max marginal relevance search for Weaviate vector store (langchain-ai#3395)

* feat: implement max marginal relevance search for Weaviate vector store

* formatting

* Adds docs

---------

Co-authored-by: jacoblee93 <jacoblee93@gmail.com>

* integration[minor]: Llama Cpp streaming (langchain-ai#3394)

* Got streaming working in LLM & Chat

* Linted streaming and added docs.

* Small fixes

* Update llama_cpp.mdx

---------

Co-authored-by: jacoblee93 <jacoblee93@gmail.com>

* Fix getBufferString method (langchain-ai#3423)

* Test fix

* Release 0.0.198

* Make undefined input optional (langchain-ai#3436)

* fix(npm script): lint fix doesn't fix fixable errors (langchain-ai#3451)

* fix(npm script): lint fix doesn't fix fixable errors

* Update langchain-core lint scripts

* Update langchain-openai lint scripts

* Update langchain-anthropic lint scripts

---------

Co-authored-by: Brace Sproul <braceasproul@gmail.com>

* Fix null or undefined records causing error in Xata similarity search method (langchain-ai#3425)

* fix: records null or undefined causing error in xata similarity search method

* fix: remove unnecessary line change

* fix: bad type error

* Format

---------

Co-authored-by: jacoblee93 <jacoblee93@gmail.com>

* new: returning generated question (langchain-ai#3433)

* export InitializeAgentExecutorOptionsStructured (langchain-ai#3442)

* langchain[patch]: Fix for Prisma vectorstore build query IN filter (langchain-ai#3462)

* Fixing an issue, where the `IN` query filter created an incorrect value syntax by joining the values as a single value. By using the Prisma join, the syntax is corrected again.

* Format

---------

Co-authored-by: Brace Sproul <braceasproul@gmail.com>
Co-authored-by: jacoblee93 <jacoblee93@gmail.com>

* langchain[patch]: Implements support for Personal Access Token Authentication in the ConfluenceLoader (langchain-ai#3409)

* update: optional personalAccessToken parameter

Add optional personalAccessToken parameter and also making username and accessToken optional
Update logic to either use personalAccessToken or username + password

* update: example to include personalAccessToken

* add: env examples

* fix: nit

langchain-ai#3409 (comment)
langchain-ai#3409 (comment)

* Update examples/src/document_loaders/confluence.ts

Co-authored-by: Brace Sproul <braceasproul@gmail.com>

* fix: initialize as undefined

* update: allow for no authorization

Allow for no authorization and skip auth header to allow accessing public spaces

* add: get keyword

* Format

* Remove unnecessary default

* Format

* Fix build

---------

Co-authored-by: Marcus Nätteldal <marcus.natteldal@ltu.se>
Co-authored-by: Brace Sproul <braceasproul@gmail.com>
Co-authored-by: Jacob Lee <jacoblee93@gmail.com>

* langchain[minor]: Added multi-message streaming to llama_cpp (langchain-ai#3463)

* Added multi message streaming to llama_cpp

* Format

---------

Co-authored-by: jacoblee93 <jacoblee93@gmail.com>

* langchain[patch]: Enhance filter functionality for Elasticsearch VectorStore (langchain-ai#3349)

* adapted buildMetadataTerms method

* backwards-compatibility

* ran yarn lint

* langchain[minor]: feat(LLM Integration): WatsonX AI Integration (langchain-ai#3399)

* base watsonx file

* base setup route

* build

* add project id lc secret

* WatsonX AI Functional MVP

* enable custom model parameters

* enable custom model selection with llama 2 default

* wrap fetch in `this.caller.call`

* run format

* add request error handling

* update typedoc string

* yarn format

* add watsonx ai example

* watsonx-ai -> watsonx_ai

* Add watson x documentation

* delete old files

* yarn format

* add error for missing project id

* Add setup note about secrets to docs

* format

* remove redundant count

* add var set on class instantiation example

* format

* update ibm cloud api key to follow common convention

* fix type cast

* update llmType casing

* add iam token caching

* use expiration field

* Small style fixes in dos

* Update watsonx_ai.ts

---------

Co-authored-by: jacoblee93 <jacoblee93@gmail.com>

* langchain[minor]: feat(LLM Integration): Gradient AI Integration (langchain-ai#3461)

* Create gradient_ai.ts

* Update package.json

* Update create-entrypoints.js

* temp removal of missing import

* yarn build

* format

* update naming and add initial call

* Functional mvp

* add caller wrap

* format

* fix call wrapper

* enable a single baseModel set

* add example

* use accessToken and workspaceId if set

* Create doc page

* format and lint

* format example

* update types

* update type string

* style fixes in docs

* Update gradient_ai.mdx

* Update gradient_ai.mdx

* Update gradient_ai.mdx

* Update gradient_ai.ts

* Rename to match Python

---------

Co-authored-by: Jacob Lee <jacoblee93@gmail.com>

* Add GooglePlaces Tool (langchain-ai#3400)

* added basic functioanlity of the google places api tool

* added google api tool to the index.ts

* Test cases for Google Places API and improved formatting

* Finishing google places tool

* Fixed bugs with google places tool files
* Fixed bugs with integration tests
* Added example usage file
* Added documentation
* Configured entry points

* Rename, small updates

* Use headers and body in Places API request

* Update build refs

* Format

* Remove artifact

---------

Co-authored-by: Yuto Omachi <youomachi@gmail.com>
Co-authored-by: Nandhakishore K.S <n.krishnamurthy@mail.utoronto.ca>
Co-authored-by: jacoblee93 <jacoblee93@gmail.com>
Co-authored-by: Brace Sproul <braceasproul@gmail.com>

* langchain[minor]: Add document loader for ChatGPT data (langchain-ai#3439)

* copy pasting and basic error fixing

* fixed testing issue

- fixed issue
- fixed code
- added blob tests
- fixed test timestamps

* Update chatgpt.mdx

* Update chatgpt.mdx

* Update chatgpt.ts

* Throws errors also package.json and .gitignore update

- console error logging also throws error instead of not doing that
- put chatgpt.ts related files into package.json and .gitignore
- ran `yarn lint` and `yarn format` many times to be sure

* Format

* whoops one more

* Fix test

---------

Co-authored-by: jacoblee93 <jacoblee93@gmail.com>

* Fix lint (langchain-ai#3466)

* langchain[patch]: Add missing entrypoint (langchain-ai#3467)

* Add missing entrypoint

* Mark as optional

* Remove double export (langchain-ai#3469)

* core[patch]: Move tests into core (langchain-ai#3450)

* Move tests into core

* export texting utils from core

* chore: lint files

* Bump version

* Sort entrypoints

---------

Co-authored-by: jacoblee93 <jacoblee93@gmail.com>

* langchain[patch]: Move tests (langchain-ai#3470)

* Move tests

* Format

* Release 0.0.199

* remove run reference (langchain-ai#3481)

* core[patch]: move more tests & test utils to core (langchain-ai#3483)

* core[patch]: move more tests & test utils to core

* cr

* cr

* Bump core (langchain-ai#3486)

* langchain[chore]: Remove duplicated code (langchain-ai#3487)

* langchain[patch]: Remove duplicated code

* yarn install

* cr

* reinstall

* reinstall

* Adds core tests to CI (langchain-ai#3489)

* Adds missing export (langchain-ai#3490)

* Catch tiktoken errors (langchain-ai#3491)

* langchain[patch]: onToken event added in ChatLlamaCpp call function (langchain-ai#3443)

* onToken added in lama cpp

* package update

* repository username

* Drop _ in _options

* onToken in llama_cpp llm

---------

Co-authored-by: Brace Sproul <braceasproul@gmail.com>

* Fix build (langchain-ai#3492)

* core[minor]: Runnable with message history (langchain-ai#3437)

* Runnable with message history

* cr

* cr

* adds withListeners method to runnables/callbacks

* added entrypoint for root listener file

* cr

* cr

* cr

* cr

* cr

* support async listeners

* allow for run or run and config as args to listener funcs

* cr

* chore: lint files

* cr

* cr

* eslint disbale any

* update types

* cr

* cr

* cr

* cr

* cr

* cr

* Style

---------

Co-authored-by: jacoblee93 <jacoblee93@gmail.com>

* langchain[patch]: feat(Gradient LLM Integration): Add fine-tuned adapter inference support (langchain-ai#3471)

* Add adapterId option

* rename base example

* add adapter example

* update gradient docs to include adapter stuff

* remove run call from gradient llm inference examples

* langchain[minor]: feat(embedding integration): Gradient AI (langchain-ai#3475)

* initial gradient embeddings implementation

* format

* remove modelslug

* update package and entrypoint -> yarn build

* map texts and change response reading

* add example

* add `caller.call` wrapper

* add docs

* remove run form example

* Update gradient_ai.mdx

---------

Co-authored-by: Brace Sproul <braceasproul@gmail.com>
Co-authored-by: Jacob Lee <jacoblee93@gmail.com>

* multi[patch]: Bump core deps (langchain-ai#3495)

* Bump core

* Bump core dep

* Fix bug (langchain-ai#3496)

* langchain[patch]: Miscellaneous test fixes (langchain-ai#3497)

* Fix bug

* Small fixes

* Release 0.0.200

* Added extra chat message class to types for history (langchain-ai#3510)

* Added extra chat message class to types for history

* chore: lint files

* remove run from watsonx ai example (langchain-ai#3503)

* core[patch]: Export Runnable history (langchain-ai#3514)

* initial docs (langchain-ai#3493)

* core[fix]: RunnableFunc config types (langchain-ai#3513)

* core[fix]: RunnableFunc config types

* cr

* chore: lint files

* Bump core version (langchain-ai#3515)

* Adds npx create-langchain-integration command (langchain-ai#3512)

* Adds npx create-langchain-integration command

* Format

* all[chore]: Use turbo repo to build api refs, docs and more (langchain-ai#3511)

* api_refs[chore]: use turbo repo to build api refs and other dependencies

* update core docs to also use turbo

* cr

* use turbo in ci

* cr

* chore: lint files

* cr

* fix: query parameters are not passed correctly to WolframAlpha API (langchain-ai#3502)

* fix: query parameters are not passed correctly to WolframAlpha API

* Lint + format

* Fix test

---------

Co-authored-by: jacoblee93 <jacoblee93@gmail.com>

* Fix linter warning (langchain-ai#3528)

* core[patch]: Reducing heap area consumption regardless of the number of prompts (langchain-ai#3519)

* Remove unused option

* Cache the Tiktoken object

* Fix format

* Bump core version

* Upgrade to js-tiktoken@1.0.8

---------

Co-authored-by: jacoblee93 <jacoblee93@gmail.com>
Co-authored-by: Tat Dat Duong <david@duong.cz>

* cr (langchain-ai#3536)

* core[tests]: Better tests for runnable history (langchain-ai#3537)

* core[tests]: Better tests for runnable history

* cr

* docs[patch]: search experiment (langchain-ai#3538)

* docs[patch]: search experiment

* lockfile change

* Add Gmail Tool (langchain-ai#3438)

* Add GmailBaseTool

* Add GmailGetMessage

* Fix eslint formatting errors

* fix: _call returns stringified output now

* feat: create gmail draft message complete

* fix: remove unused parameter

* fix: removed unused import statement

* fix: reformatted file and made method private

* Add GmailGetThread

* Fixes formatting issues

* Fix _call error

* Add GmailSearch

* Fix build error on types

* Create sent_message.ts

* Update sent_message.ts

run the prettier to format the document

* Update sent_message.ts

combine the sendMessage function into _call function

* Move gmail object from children to parent GmailBaseTool

* Fix formatting in gmail/base.ts

* fix: switched to Buffer class for base64 encode

* Make fields optional and use defaults properly in constructor

Previously the default values weren't being used in the constructor, this commit fixes that.

Also fields are now optional in each of the gmail tool constructors since they have defaults
as backups anyways

* Use Zod to parse input of GmailBaseTool constructor

* Update zod schema to be entirely optional for GmailBaseToolParams

* Create docs for Gmail Tool

* Add comment for default parameters, fix formatting

* Remove model from default parameters comment

* Add relavent tools in gmail example

* Add index.ts for all exports and rename send_message

* Add unit tests for gmail tools

* Add gmail type definitions to package.json

* Update typedoc.json

add gmail to typedoc.json

* Update create-entrypoints.js

add the entrypoints for gmail tool

* add description for our function

add example on our description

* update .gitignore

* fix the entrypoint

* change order

* change the zod

* fix the format

* Update base.ts

fix lint problem

* Update base.ts

remove the unuse comment

* add description for search

* fix: gmail tools extend structured tool

* Update descriptions.ts

* fix: tree shaking issues with zod fixed

* fix: prettier formatting

* Add zod back to GmailBaseTool

* Fix gmail example to work for StructuredTool

* Add gmail API key setup instructions in docs

* Fix formatting

* Fix formatting

* Replace .call with .invoke in gmail example

* Update gmail.ts

---------

Co-authored-by: Hamoon Zamiri <hamoon.zamiri@mail.utoronto.ca>
Co-authored-by: saeedahsan <ahsan02@gmail.com>
Co-authored-by: SeannnX <122410542+SeannnX@users.noreply.github.com>
Co-authored-by: Hamoon <90403097+HamoonZamiri@users.noreply.github.com>
Co-authored-by: Ahsan Saeed <ahsanm.saeed@mail.utoronto.ca>
Co-authored-by: Jacob Lee <jacoblee93@gmail.com>

* feat: Add ObsidianLoader to Document loaders (langchain-ai#3494)

* Add ObsidianLoader integration

* Fix Notion test not to consider Obsidian '.md' files

* Fix lint

---------

Co-authored-by: jacoblee93 <jacoblee93@gmail.com>

* all[patch]: Add .turbo to files removed by yarn clean (langchain-ai#3540)

* Add Connery Tool and Toolkit (langchain-ai#3499)

* Add ConneryApiClient class

* Intermediate sate

* Fix ConneryToolkit typo and update ConneryService
method name

* Init docs

* Update docs

* Fix imports in docs

* Create entry points for the tool and toolkit

* Fix the docs issue

* Bump core

* Release 0.0.201

* Fix OpenAI agent docs (langchain-ai#3543)

* Update contributing guidelines (langchain-ai#3550)

* core[infra]: Adds turbo to core (langchain-ai#3551)

* docs[patch]: Agent pointer (langchain-ai#3549)

* Adds pointer to OpenAI functions agent

* Adds pointer to OpenAI functions agent

---------

Co-authored-by: Brace Sproul <braceasproul@gmail.com>

* Revert "all[chore]: Use turbo repo to build api refs, docs and more" (langchain-ai#3535)

* Revert "all[chore]: Use turbo repo to build api refs, docs and more (langchain-ai#3511)"

This reverts commit f289f3d.

* cr

* Upgrade xata client to 0.28.0 and apply required change (langchain-ai#3553)

Co-authored-by: jacoblee93 <jacoblee93@gmail.com>

* core[docs]: Docs for with listeners runnable method (langchain-ai#3531)

* core[docs]: Docs for with listeners runnable method

* chore: lint files

* cr

* langchain[docs]: agent stream example docs (langchain-ai#3384)

* agent stream example docs

* cr

* Fixed agent

* chore: lint files

* Added extra chat message class to types for history

* cr

* cr

* cr

* cr

* cr

* chore: yarn prettier

* cr

* Improve interaction with streams and Node's Readable.from() method (langchain-ai#3556)

* core[patch]: Simplify RunnableSequence transform implementation (langchain-ai#3558)

* Simplify RunnableSequence transform implementation

* Fix test

* Fix tracing tags

* Bump core

* Release 0.0.202

* chore: Upgrade Typescript to 5.1 (langchain-ai#3562)

* upgrade typescript to 5.1

* fix some type errors

---------

Co-authored-by: David Illing <dilling123@gmail.com>

* core[docs]: Docs & example for runnable history (langchain-ai#3527)

* core[docs]: Docs & example for runnable history

* cr

* cr

* cr

* chore: yarn prettier

* cr

* cr

* cr

* chore: lint files

* cr

* core[chore]: widen semver range for langsmith (langchain-ai#3564)

* core[chore]: widen semver range for langsmith

* Update lockfile

---------

Co-authored-by: jacoblee93 <jacoblee93@gmail.com>

* docs[patch]: Use an example file for docs code examples (langchain-ai#3567)

* docs[patch]: Use an example file for docs code examples

* cr

* cr

* cr

* Update default PDF spliter (langchain-ai#3568)

* Bump core (langchain-ai#3572)

* langchain[minor]: Experimental Masking Module (langchain-ai#3548)

* [Feature] Implementation of experimental masking parser/transformer

* test: add perf unit test

* fix: rename piitransformer to regextransformer

* added example Kitchen Sink for masking parser

* docs: Add documentation, nextjs example and kitchen sink example

* fix: wording

* docs: add basic example

* fix: remove comment and return stream

* feat: async hooks, immutable parser state

* fix: parse -> mask

* fix: || -> ??

* Fix lint, style

* Fix build

* Update mask.mdx

---------

Co-authored-by: Dzmitry Dubarau <dzmitry.dubarau@ally.com>
Co-authored-by: Dzmitry A Dubarau <dzmitry.dubarau@gmail.com>
Co-authored-by: jacoblee93 <jacoblee93@gmail.com>

* Update build artifacts

* Release 0.0.203

* Fix docs (langchain-ai#3584)

* Fix typo in example (langchain-ai#3585)

* core[docs]: Added get started page to LCEL (langchain-ai#3571)

* core[docs]: Added get started page to LCEL

* chore: lint files

* cr

* cr

* community[major]: Merge community (langchain-ai#3610)

* langchain[major]: LangChain community (langchain-ai#3581)

* Initial langchain-community commit

* Move LLMs

* Add tools

* Add more integrations

* Lint, format build

* More refactoring

* Build fixes

* Update lockfile

* Fix docs (langchain-ai#3584)

* Fix typo in example (langchain-ai#3585)

* core[docs]: Added get started page to LCEL (langchain-ai#3571)

* core[docs]: Added get started page to LCEL

* chore: lint files

* cr

* cr

* Format, lint

* Move more modules

* Move more modules, fix build

* Move more tests

* Revert serialization changes

* Use OpenAI package

* Remove unused file

* Format

* Fix build

* Sync core

* Fix build command

---------

Co-authored-by: Brace Sproul <braceasproul@gmail.com>

* Move memory vector store back into langchain

* Move memory

* Move toolkits

* Move more tools and toolkits

* Update yarn lock

* Fix lint

* Move test

* Brace/add missing neo4j test (langchain-ai#3597)

* proper[minor]: Add back missing Neo4j int test

* chore: lint files

* community[minor]: Fix CI (langchain-ai#3601)

* Fix CI

* Format

* Fixes

* Fix

* Another try

* Fix typo

* Brace/cleanup deps (langchain-ai#3600)

* proper[major]: Cleaned up deps in langchain

* cr

* cr

* cr

* cr

* cr

* cr

* cr

* Add skeleton of import tests

* community[major]: Cleaned deps (langchain-ai#3602)

* Bump version

* Remove unused types

* Use different import maps for core vs main langchain (langchain-ai#3617)

* Use different import maps for core vs main langchain

* Expand serialization test to include more expected entrypoints

---------

Co-authored-by: Brace Sproul <braceasproul@gmail.com>

* Bump subpackage versions

* Fix linter warnings

* langchain[patch]: Build deps before testing (langchain-ai#3618)

* Release 0.0.204

* all[patch]: Ensure other subpackages are built before test/build (langchain-ai#3624)

* Ensure other subpackages are built before test/build

* Fix test

* Fix API ref build

* Fix docs build

* Fix build

* Fix build

* Build serially

* Fix build

* Bump core

* Release 0.0.205

* Add community as workspace dep to examples (langchain-ai#3633)

* api_refs[major]: Remove script, use custom typedoc plugin (langchain-ai#3630)

* api_refs[major]: Remove script, use custom typedoc plugin

* cr

* cr

* rm typedoc dep

* force no cache

* cr

* cr

* add custom build:vercel scripts

* cr

* cr

* Add README for community + core (langchain-ai#3637)

* Add Discord Tool (langchain-ai#3444)

* Added the discord get messages tool.

* Updated discord.js dependency

* Added send messages tool

* Added the discord channel search tool.

* Fixed syntax issue

* Added get servers & get text channels

* Added documentation and examples

* Rename discord.test.ts to discord.int.test.ts

* Rename discord.test.ts to discord.int.test.ts

* Passed yarn lint and yarn format

* Made botToken first argument in tools

* updated to single object instead of multiple args

* fixed discord arguments to use fields

* updated agent test

* Made requested changes

* Move to community

* Remove build artifacts

* Update lock

---------

Co-authored-by: your_username <arielleramgoolie27@gmail.com>
Co-authored-by: Arielle Ramgoolie <88518136+ArielleRamgoolie@users.noreply.github.com>
Co-authored-by: slairu <sarah22liu@gmail.com>
Co-authored-by: sh-hz <box.shzehi@gmail.com>
Co-authored-by: sh-hz <144635183+sh-hz@users.noreply.github.com>
Co-authored-by: Jacob Lee <jacoblee93@gmail.com>
Co-authored-by: Brace Sproul <braceasproul@gmail.com>

* community[patch]: removing null chars (langchain-ai#3628)

* tweak: file moved

* Format

---------

Co-authored-by: jacoblee93 <jacoblee93@gmail.com>

* core[patch],community[patch]: Make traced DynamicTool runs use tool name (langchain-ai#3635)

* Make traced DynamicTool runs use tool name

* Revert test

* Fix typo

* feat: add inference for RunnableMap RunOutput type (langchain-ai#3517)

* add RunnableMapLike to infer RunnableMap output

* remove unneeded changes

* fix linting

* format

* fix runnable_stream_log.test

* upgrade typescript version

* clean types

* fix structured_output_runnables.int.test

* ts version ~5.1.6

* remove unused eslint-disable-next-line

* remove another disable no-explicit-any

* remove another no-explicit-any

* move eslint

* Format

* Default runnable maps to any type in case inference is not possible

* Add tests

---------

Co-authored-by: David Illing <dilling123@gmail.com>
Co-authored-by: jacoblee93 <jacoblee93@gmail.com>

* Bump versions

* Update lock

* Fix export test

* Release 0.0.206

* Bump (langchain-ai#3640)

* docs[patch]: Add yarn clean script to docs & run in build (langchain-ai#3646)

* docs[patch]: Add yarn clean script to docs & run in build

* cr

* dont use turbo to build api refs (langchain-ai#3647)

* langchain[patch]: Issue langchain-ai#2756 Add Qdrant custom payload on documents to query them by filter (langchain-ai#3431)

* Add optional custom payload param and 1 test

* Add Custom Payload in Documents

* Resolve comments in PR

* Add document changes to langchain-core

* Test because all yarn test cases are passing

* Update tests and fix yarn lint

* Remove payload from Document and add object[]

* Remove object[] and replace with objects +comments

* Update add document types

---------

Co-authored-by: dom_ <luszczynski.dominik@gmail.com>
Co-authored-by: jacoblee93 <jacoblee93@gmail.com>

* pass in authoptions correctly from initialization down to connection sdk (langchain-ai#3598)

Co-authored-by: jacoblee93 <jacoblee93@gmail.com>

* community[major]: Added integration with new Gemini API (langchain-ai#3621)

* Added integration with new Gemini API

* added to requiresOptionalDependency

* reverted old models

* fixed linting

* chat

* Cleanup

* Run format

* Update deps

* Move to chat model, add tests

* Add docs markdown skeleton

* Move deprecation notices around

* Docs update

* Fix dependency issue

* Fix docs path

* moved conversion function

* cleanup

* Update lockfile

* minor cleanup

* docs indent

* More updates to docs

* removed enum imports

* fixed enum imports/exports

* Docs

* import order

* Fix lint

---------

Co-authored-by: jacoblee93 <jacoblee93@gmail.com>

* Update @langchain/google-genai versioning (langchain-ai#3648)

* Use more permissive dependency range for side packages

* Update package.json

* Small docs update

* Adds delete method to PGVectorStore (langchain-ai#3590)

* added delete method to pgvectorstore

* added  tests for pgvectorstore delete method

* fix comments

* Add example

---------

Co-authored-by: jacoblee93 <jacoblee93@gmail.com>

* Update AssemblyAI SDK (langchain-ai#3599)

* Use latest AAI SDK

* Update to latest AAI SDK

* langchain-mistralai[major]: Add MistralAI chat and embed (langchain-ai#3623)

* langchain-mistralai[major]: Add MistralAI chat and embed

* chore: lint files

* chore: lint files

* cr

* docs

* cr

* install

* fix tests

* fix docs

* chore: lint files

* dont use turbo to build api refs

* cr

* cr

* yarn

* extend base instead of simple chat model

* chore: lint files

* add docs for embeddings

* chore: lint files

* core v to 0.1.0

* cr

* chore: lint files

* Fix lint

---------

Co-authored-by: jacoblee93 <jacoblee93@gmail.com>

* Update test

* Bump community

* Release 0.0.207

* Adds integration package installation instructions (langchain-ai#3650)

* Adds integration package installation instructions

* Update Mistral docs

* Remove old warning

* Change wording (langchain-ai#3652)

* docs[patch]: Added missing LangSmith trace link in mistral docs (langchain-ai#3659)

* bugfixes in google Generative AI chat_model (langchain-ai#3657)

* adding candidateCount + fixing stopSequences not being set in the request

* Update chat_models.ts

* Small Google fixes

* Fix test

* Format

---------

Co-authored-by: Jacob Lee <jacoblee93@gmail.com>

* Bump google genai version (langchain-ai#3660)

* Bump version

* Update yarn lock

* community[major]: Add Together AI LLM integration  (langchain-ai#3627)

* cr

* lint

* added docs & created entrypoint

* chore: lint files

* all[patch]: Ensure other subpackages are built before test/build (langchain-ai#3624)

* Ensure other subpackages are built before test/build

* Fix test

* Fix API ref build

* Fix docs build

* Fix build

* Fix build

* Build serially

* Fix build

* Bump core

* Release 0.0.205

* added docs & created entrypoint

* cr

* fixed stream

* cr

* chore: lint files

* fix example

* chore: lint files

* streaming example

* format

* add langsmith

---------

Co-authored-by: Jacob Lee <jacoblee93@gmail.com>

* Update README and contributing guidelines (langchain-ai#3666)

* Update README and contributing guidelines

* Fix links

* Fix typo

* Fix links

* Fix link

* Fix links

* Update README.md

Co-authored-by: Brace Sproul <braceasproul@gmail.com>

---------

Co-authored-by: Brace Sproul <braceasproul@gmail.com>

* Update OpenAI embeddings `stripNewLines` to be default `false` (langchain-ai#3612)

* Updating the default value for `stripNewLines` to `false` in the `openai` embeddings. Removing newlines was beneficial for `V1` (`-001`) models, but should not be mandatory for `V2` (`-002`) models. This is explained in openai/openai-python#418 (comment)

This brings the field in line with the default model, `text-embedding-ada-002`.

The LangChain Python library also enables this only for `-001` models: https://github.com/langchain-ai/langchain/blob/c0f4b95aa9961724ab4569049b4c3bc12ebbacfc/libs/langchain/langchain/embeddings/openai.py#L466

* Reverting the default value, so it's `true` by default again.
Marked with a comment indicating that this should be changed in a future minor release.
Referenced the PR, as it explains why the default should be updated.

* Resolving conflicts, adding changes again to new location.

* Fix invert runId and threadId (langchain-ai#3665)

Fix invert runId and threadId to match openai.beta.threads.runs.submitToolOutputs arguments

* experimental[patch]: Improve AutoGPT's output_parser to extract JSON code block (langchain-ai#3656)

* Improve output_parser to extract JSON code block

Closes langchain-ai#3655

* Ran yarn format and lint

* community[patch]: Fix RRF normalization and lucene characters for neo4j vector (langchain-ai#3653)

* Fix RRF normalization and lucene characters for neo4j vector

* Formatting

* community[patch]: Update ElasticSearch mappings to successfully add documents from TextSplitter (langchain-ai#3629)

* failing test that shows how the loc format from text splitters conflicts with elasticsearch mappings

* explicitly declare .metadata.loc as an object in elasticsearch

* Throw an error if inserting vectors into elasticsearch fails.

* Lint + docs format

* Update docs

* Format

---------

Co-authored-by: Brace Sproul <braceasproul@gmail.com>
Co-authored-by: jacoblee93 <jacoblee93@gmail.com>

* docs[patch]: Update LangChain README (langchain-ai#3669)

* Update LangChain README

* Update link

* Update example

* mistral[minor]: Dynamically import mistral (langchain-ai#3670)

* mistral[minor]: Dynamically import mistral

* dynamic import for embeddings

* cr

* chore: lint files

* Bump version (langchain-ai#3671)

* Bump version (langchain-ai#3672)

* community[minor]: Adds chat endpoint and multimodal support for Ollama (langchain-ai#3673)

* Adds chat endpoint support for Ollama

* Fix build

* Export for consistency

* Add multimodal support

* Fix lint + format

* Bump community version

* Release 0.0.208

* updated docs to reflect usage of PaLM based classes (langchain-ai#3678)

* mistral[minor]: Fix assigning class properties (langchain-ai#3681)

* mistral[minor]: Fix assigning class properties

* cr

* chore: lint files

* bump v to 0.0.3

* docs[patch]: Use mixtral default model for togetherai (langchain-ai#3679)

* docs[patch]: Use mixtral default model for togetherai

* chore: lint files

* update langsmith link

* Fix 1 typo in Assistants Docs and added 2 extra suggestions (langchain-ai#3680)

* Fix typo for fileIds in openai_assistant.mdx

* Update openai_assistant.mdx for preventing apiKey errors

* docs[patch]: Add section about req optional dep (langchain-ai#3689)

* docs[patch]: Add section about req optional dep

* Update CONTRIBUTING.md

* core[patch]: Use an interface for runnables to allow more compatibility between core versions (langchain-ai#3684)

* Use an interface for runnables to allow more compatibility between core versions

* Lint + format

* Remove unnecessary exports

* Add transform to required methods

* core[patch]: Add LLM/ChatModel callbacks to cached generation (langchain-ai#3392)

* Add _generateCached callback to chat_models

* Add _generateCached to llms

* Fix formatting

* Remove unused ignore

* Pass llmStringKey as parameter

* Wrap generateCached arguments into object

* Add comment for defineProperty block

* Fix run managers getting filtered out

* Fix formatting

* Naming nit

* Use more type imports

* Add language model callback tests

---------

Co-authored-by: Brace Sproul <braceasproul@gmail.com>
Co-authored-by: jacoblee93 <jacoblee93@gmail.com>

* integrations[minor]: Add readmes to add integrations, improve templat… (langchain-ai#3683)

* integrations[minor]: Add readmes to add integrations, improve template readme

* add default readme

* chore: lint files

* cr

* community[minor]: Jacob/vectara summarization (langchain-ai#3636)

* updated documentation
added X-Source to header

* initial

* added MMR
VectaraRetriever now supports summarization as an integral part of the flow

* updated example and bugfix

* updated tests

* updated tests

* after yarn format

* Renamed VectaraRetriever to VectaraSummaryRetriever
Moved to langchain/retrievers/

---------

Co-authored-by: Ofer Mendelevitch <ofer@vectara.com>
Co-authored-by: Ofer Mendelevitch <ofermend@gmail.com>

* Add missing max_tokens option to TogetherAI (langchain-ai#3687)

* init improvements, add index options, multiple binds per filter (langchain-ai#3579)

resolving initialization issues, adding

Co-authored-by: Phil Miesle <phil.miesle@datastax.com>

* Update READMEs (langchain-ai#3693)

* Bump package versions

* Release 0.0.209

* Bump version

* Fixed errors 

* Added AzureML LLM (#1)

* Added azure_ml llm endpoint

* Added azure_ml entrypoint

* Added field check & prettified code

* Fixed error string

* Fixed LLM string

* Added azure ml llm constants & type & example usage

* Update gitignore & package.json

* Added bad response check

* Added comments on example

* Added int test for azure ml

* Added doc integration for Azure ML llm

* did azureml_chat and made changes to azureml_llm

* Added requested changes (#4)

* Made requested changes

* Formatted

---------

Co-authored-by: Vis <vishakanshanthakumar@gmail.com>

* Fixed unused imports error

* Prettified test file

---------

Co-authored-by: Vis <vishakanshanthakumar@gmail.com>
Co-authored-by: Vishakan <152434517+univish@users.noreply.github.com>

* docs: keywords (langchain-ai#3705)

* docs: keywords

* format

* google-genai[patch]: Hookup callbacks to stream & generate methods (langchain-ai#3708)

* google-genai[patch]: Hookup callbacks to stream & generate methods

* chore: lint files

* core[minor]: Move chunk array to core (langchain-ai#3711)

* all[patch]: Fix typing across different core versions by using interfaces instead of abstract classes (langchain-ai#3709)

* Make model reliant modules use BaseLanguageModelInterface instead of BaseLanguageModel

* Fix import

* Adds BaseRetrieverInterface

* Adds prompt value interface

* Format

* Revert

* Use document and embeddings interfaces

* Use vectorstore interface

* Adds tool and structured tool interfaces

* Use type imports

* Use type import

* examples[patch]: Fixes type error in vectara example (langchain-ai#3719)

* examples[patch]: Fixes type error in vectara example

* chore: lint files

* cr

* Add stop to Together AI (langchain-ai#3714)

Signed-off-by: Sunghyun Hwang <hwang@hey.com>

* Use new Anthropic beta endpoint in new package (langchain-ai#3720)

* Use new Anthropic beta endpoint in new package

* Update docs

* Fix stop sequence binding

* Fix lint

* Update lock

* Version bumps

* Switch version

* Release 0.0.210

* small requested changes to docs

* community[tests]: Add docker-compose for easier testing of pgvector (langchain-ai#3723)

* community[tests]: Add docker compose for easier testing of pgvector

* cr

* chore: lint files

* Update Anthropic docs (langchain-ai#3728)

* all[minor]: Remove duplicated chunk arr code, import from core (langchain-ai#3731)

* all[minor]: Remove duplicated chunk arr code, import from core

* chore: lint files

* chore: lint files

* fix template core version

* community[major]: Together AI embeddings (langchain-ai#3729)

* community[major]: TogetherAI embeddings

* cr

* rm docs

* chore: lint files

* Implementing last requested changes

* Update azure_ml.int.test.ts

* community[patch]: Fix bad chunk array import (langchain-ai#3733)

* community[patch]: Fix bad chunk array import

* chore: lint files

* Fix deserialisation of additional_kwargs and tool_call_id (langchain-ai#3721)

* docs[major]: Generate API refs for all packages (langchain-ai#3690)

* docs[major]: Generate API refs for all packages

* cr

* cr

* chore: lint files

* remove src/ or libs/ from pathnames

* proper version & name

* chore: lint files

* chore: lint files

* cr

* cr

* cr

* cr

* filter with bang

* cr

* Update docs structure (langchain-ai#3736)

* community[minor]: Adds optional IDs parameter to PGVectorStore add-* methods (langchain-ai#3692)

* added delete method to pgvectorstore

* added  tests for pgvectorstore delete method

* fix comments

* Add example

* add ids param to pgvector add methods

* update doc comments

* add test for id insertion

* fix doc comments

* Change options arg for consistency with base class

* Change port to integration test default

---------

Co-authored-by: jacoblee93 <jacoblee93@gmail.com>
Co-authored-by: bracesproul <braceasproul@gmail.com>

* all[major]: Better release workflow (langchain-ai#3717)

* all[minor]: Better release workflow

* cr

* added release-it & config files to all pkgs, and template

* tmp: rename mistralai to brace from langchain

* tmp: rename mistralai to brace from langchain

* add missing test infra to libs

* cr

* cr

* cr

* cr

* cr

* chore: lint files

* cr

* revert mistral workspace name change

* chore: lint files

* update pkg json script

* tmp change names

* cr

* cr

* cr

* cr

* cr

* cr

* try/catch around yarn install

* cr

* cr

* cr

* cr

* cr

* cr

* cr

* update scripts

* tmp make mistral basproul npm

* account for npm 2fa

* cr

* cr

* cr

* cr

* support for npm tags

* cr

* revert basproul changes

* chore: lint files

* cr

* drop release it script

* verify version with semver

* drop empty tests

* fix dep

* docs

* Update CONTRIBUTING.md

---------

Co-authored-by: Jacob Lee <jacoblee93@gmail.com>

* tried to undo something

* I messed up and am trying to fix it

* ran eslint on azure_ml files

* Prettified lint fixes

* Added back combineLLMOutput

---------

Signed-off-by: Sunghyun Hwang <hwang@hey.com>
Co-authored-by: jacoblee93 <jacoblee93@gmail.com>
Co-authored-by: Brace Sproul <braceasproul@gmail.com>
Co-authored-by: David Duong <david@duong.cz>
Co-authored-by: Adeel Ehsan <aadeel.ehsan@gmail.com>
Co-authored-by: Adeel Ehsan <adeel.ehsan@wellthy.com>
Co-authored-by: Ofer Mendelevitch <ofermend@gmail.com>
Co-authored-by: Sumit Kumar Purohit <sumopurohit@gmail.com>
Co-authored-by: Antoine Garcia <a.garcia.walecha@gmail.com>
Co-authored-by: Paolo Castro <castro.crea@gmail.com>
Co-authored-by: Gram Liu <gram.liu226@gmail.com>
Co-authored-by: Luiz Felipe <68920578+luiz-k-alencar@users.noreply.github.com>
Co-authored-by: Nicolas Juelle <n.juelle@gmail.com>
Co-authored-by: Michael Landis <michael@momentohq.com>
Co-authored-by: David Zhuang <i@dz.ax>
Co-authored-by: Nuno Campos <nuno@boringbits.io>
Co-authored-by: Devin Burnette <devin.burnette@betterment.com>
Co-authored-by: Alex Naymushin <alexander.naymushin@oxicom.ru>
Co-authored-by: Alex Naymushin <alexander.naymushin@omnigon.com>
Co-authored-by: Tomaz Bratanic <bratanic.tomaz@gmail.com>
Co-authored-by: Oskar Hane <oh@oskarhane.com>
Co-authored-by: Mish Ushakov <10400064+mishushakov@users.noreply.github.com>
Co-authored-by: Mish Ushakov <mishushakov@users.noreply.github.com>
Co-authored-by: Dezhi Ren <55412122+DravenCat@users.noreply.github.com>
Co-authored-by: Kah Wai Liew <tureki@me.com>
Co-authored-by: Zhitao Xu <xuzhitao200020612@gmail.com>
Co-authored-by: CalebZhang <64758307+kz4ever@users.noreply.github.com>
Co-authored-by: Divyansh Kachchhava <114203084+divyanshuoft@users.noreply.github.com>
Co-authored-by: Alfred Tze-Hong Ha <47645447+xxlalfredo99@users.noreply.github.com>
Co-authored-by: CalebZhang <kevineview@gmail.com>
Co-authored-by: Alexander Claus <134403026+the-powerpointer@users.noreply.github.com>
Co-authored-by: Nigel Daniels <nigel.daniels@me.com>
Co-authored-by: Chase McDougall <chasemcdougall@hotmail.com>
Co-authored-by: Akshay Maurya <akshaymaurya3006@gmail.com>
Co-authored-by: phof <37412+phof@users.noreply.github.com>
Co-authored-by: jon <906671+jondwillis@users.noreply.github.com>
Co-authored-by: Jordy Hoolwerf <jordy.hoolwerf@hrorganizer.com>
Co-authored-by: Marcus Nätteldal <yohasse@outlook.com>
Co-authored-by: Marcus Nätteldal <marcus.natteldal@ltu.se>
Co-authored-by: Laurens Tsestigh <90600075+LaurensTsestigh@users.noreply.github.com>
Co-authored-by: Sanan Rao <raosanan@gmail.com>
Co-authored-by: Yuto Omachi <youomachi@gmail.com>
Co-authored-by: Nandhakishore K.S <n.krishnamurthy@mail.utoronto.ca>
Co-authored-by: Zeneos <everythingnotgaming@gmail.com>
Co-authored-by: Shareef P <shareefmorayur@gmail.com>
Co-authored-by: Abderrahim Mellouki <abderrahim.mellouki.p@gmail.com>
Co-authored-by: Tsukasa OISHI <tsukasa.oishi@gmail.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: Uthman Mohamed <83053931+1239uth@users.noreply.github.com>
Co-authored-by: Hamoon Zamiri <hamoon.zamiri@mail.utoronto.ca>
Co-authored-by: saeedahsan <ahsan02@gmail.com>
Co-authored-by: SeannnX <122410542+SeannnX@users.noreply.github.com>
Co-authored-by: Hamoon <90403097+HamoonZamiri@users.noreply.github.com>
Co-authored-by: Ahsan Saeed <ahsanm.saeed@mail.utoronto.ca>
Co-authored-by: Eze <eactisgrosso@gmail.com>
Co-authored-by: Volodymyr Machula <machulav@gmail.com>
Co-authored-by: Tudor Golubenco <tudor@xata.io>
Co-authored-by: David Illing <dilling@users.noreply.github.com>
Co-authored-by: David Illing <dilling123@gmail.com>
Co-authored-by: Maciej Holyszko <14310995+falkenhawk@users.noreply.github.com>
Co-authored-by: Nolansym <gilliamja.te@gmail.com>
Co-authored-by: Dzmitry Dubarau <dzmitry.dubarau@ally.com>
Co-authored-by: Dzmitry A Dubarau <dzmitry.dubarau@gmail.com>
Co-authored-by: Maaneth De Silva <94875583+Maanethdesilva@users.noreply.github.com>
Co-authored-by: your_username <arielleramgoolie27@gmail.com>
Co-authored-by: Arielle Ramgoolie <88518136+ArielleRamgoolie@users.noreply.github.com>
Co-authored-by: slairu <sarah22liu@gmail.com>
Co-authored-by: sh-hz <box.shzehi@gmail.com>
Co-authored-by: sh-hz <144635183+sh-hz@users.noreply.github.com>
Co-authored-by: youngjaeheo2002 <101202147+youngjaeheo2002@users.noreply.github.com>
Co-authored-by: dom_ <luszczynski.dominik@gmail.com>
Co-authored-by: GG <95317664+pixelcatgg@users.noreply.github.com>
Co-authored-by: Alex Ostapenko <alx13@users.noreply.github.com>
Co-authored-by: MJDeligan <48515433+MJDeligan@users.noreply.github.com>
Co-authored-by: Niels Swimberghe <3382717+Swimburger@users.noreply.github.com>
Co-authored-by: Haouari haitam Kouider <57036855+haouarihk@users.noreply.github.com>
Co-authored-by: Shady Al Shoha <48188608+shadyshoha@users.noreply.github.com>
Co-authored-by: Maytee Chinavanichkit <mayt@users.noreply.github.com>
Co-authored-by: Matt Raibert <mattraibert@positiondev.com>
Co-authored-by: Juanjo do Olmo <87780148+SimplyJuanjo@users.noreply.github.com>
Co-authored-by: Ofer Mendelevitch <ofer@vectara.com>
Co-authored-by: Allan Zimmermann <allan@openrpa.dk>
Co-authored-by: Phil Miesle <mieslep@users.noreply.github.com>
Co-authored-by: Phil Miesle <phil.miesle@datastax.com>
Co-authored-by: Vis <vishakanshanthakumar@gmail.com>
Co-authored-by: Vishakan <152434517+univish@users.noreply.github.com>
Co-authored-by: Sanjay Mylanathan <63125111+Ultrahi@users.noreply.github.com>
Co-authored-by: Sunghyun Hwang <hwang@prep.app>
Co-authored-by: Ultrahi <sanjaymylanathan85@gmail.com>
janus-dev87 added a commit to janus-dev87/llama-index-typescript that referenced this issue Mar 1, 2024
tazarov added a commit to amikos-tech/llama_index that referenced this issue May 3, 2024
- For models other than first gen text embeddings (-001) new line removal is not necessary

openai/openai-python#418
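
The conclusion of this thread can be sketched in a few lines of Python. This is only an illustration, not the code of any library mentioned here: `prepare_for_embedding` is a hypothetical helper, and identifying first-generation models by a `-001` suffix is an assumption based on the naming used in the commits above.

```python
def prepare_for_embedding(text: str, model: str) -> str:
    """Return the text to send to the embeddings endpoint.

    Hypothetical helper: newlines are replaced with spaces only for
    first-generation models, assumed here to be those whose name ends
    in "-001". For any other model the text is returned unchanged.
    """
    if model.endswith("-001"):
        # Older (v1) embedding models were reported to handle newlines poorly.
        return text.replace("\n", " ")
    # Newer models: keep the input exactly as the caller provided it.
    return text
```

With this approach, `prepare_for_embedding("foo bar\nbaz", "text-embedding-ada-002")` leaves the string untouched, which also keeps any fingerprint of the input consistent with what was actually embedded.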
Labels
question Further information is requested