Why does newline negatively impact embedding performance? #418

@ravwojdyla

Describe the bug

While reading the code in embeddings_utils, I stumbled upon this:

# replace newlines, which can negatively affect performance.
text = text.replace("\n", " ")

Could you please provide more context on:

replace newlines, which can negatively affect performance.

Are there any references, papers, or benchmark numbers behind the claimed negative impact?

To Reproduce

get_embedding("foo bar\nbaz")
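To make the question concrete, here is a minimal sketch of how the effect could be checked: embed the same text with and without the replacement and compare the two vectors. `Embedder`, `strip_newlines`, `cosine_similarity`, and `newline_effect` are hypothetical names introduced for illustration, and `embed` is assumed to be any callable that maps a string to a vector without doing its own newline handling.

```python
import math
from typing import Callable, Sequence

# Hypothetical type alias: any function that maps text to an embedding vector.
Embedder = Callable[[str], Sequence[float]]


def strip_newlines(text: str) -> str:
    # Same preprocessing as the snippet quoted above.
    return text.replace("\n", " ")


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def newline_effect(embed: Embedder, text: str) -> float:
    """Cosine similarity between the embedding of `text` as-is and the
    embedding of the same text with newlines replaced by spaces."""
    return cosine_similarity(embed(text), embed(strip_newlines(text)))
```

Since the replacement appears to happen inside the helper itself, passing `get_embedding` as `embed` would compare two identical inputs; the callable would need to reach the model without that step. A similarity close to 1.0 would only show that the vectors barely move, not that downstream task quality is unaffected, which is why actual benchmark numbers would be helpful.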

Code snippets

No response

OS

macOS

Python version

3.10

Library version

0.27.4

Labels: question (Further information is requested)