core[patch]: Convert SimSIMD back to NumPy (langchain-ai#19473)
This patch fixes langchain-ai#18022 by converting SimSIMD's internal
zero-copy outputs to NumPy arrays.

I've also noticed that a `dtype=np.float32` conversion is often applied
before passing inputs to SimSIMD. Which numeric types do LangChain users
generally care about? SimSIMD supports `float64`, `float32`, `float16`, and
`int8` for cosine distances, and `float16` seems reasonable for
practically any kind of embeddings on any modern piece of hardware, so
we can change that part as well 🤗
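To make the `float16` question above concrete, here is a minimal sketch that measures how much cosine similarity shifts when embeddings are stored as `float16` instead of `float32`. It uses NumPy only and synthetic random vectors, so it isolates storage rounding and says nothing about SimSIMD's own half-precision kernels or about real embedding models. The helper name `cosine_similarity_np` and the variables `emb32` and `emb16` are hypothetical and not part of the patch.

```python
import numpy as np


def cosine_similarity_np(X: np.ndarray, Y: np.ndarray) -> np.ndarray:
    """Cosine similarity between rows of X (n, d) and Y (m, d), computed in float64."""
    X = np.asarray(X, dtype=np.float64)
    Y = np.asarray(Y, dtype=np.float64)
    X_norm = np.linalg.norm(X, axis=1, keepdims=True)  # shape (n, 1)
    Y_norm = np.linalg.norm(Y, axis=1, keepdims=True)  # shape (m, 1)
    return (X @ Y.T) / (X_norm * Y_norm.T)


# Synthetic stand-in for embeddings (assumption: real embeddings may behave differently).
rng = np.random.default_rng(0)
emb32 = rng.standard_normal((8, 128)).astype(np.float32)
emb16 = emb32.astype(np.float16)  # half the storage per element

reference = cosine_similarity_np(emb32, emb32)
half = cosine_similarity_np(emb16, emb16)
max_abs_error = float(np.max(np.abs(reference - half)))

print("bytes float32:", emb32.nbytes, "bytes float16:", emb16.nbytes)
print("max abs similarity error:", max_abs_error)
```

Whether the observed error is acceptable depends on the retrieval task, so a check like this is best rerun on representative embeddings before changing the default dtype.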
ashvardanian authored and rahul-trip committed Mar 27, 2024
1 parent f14322a commit 045a9a3
Showing 4 changed files with 4 additions and 4 deletions.
2 changes: 1 addition & 1 deletion libs/community/langchain_community/utils/math.py
@@ -29,7 +29,7 @@ def cosine_similarity(X: Matrix, Y: Matrix) -> np.ndarray:
         Z = 1 - simd.cdist(X, Y, metric="cosine")
         if isinstance(Z, float):
             return np.array([Z])
-        return Z
+        return np.array(Z)
     except ImportError:
         logger.info(
             "Unable to import simsimd, defaulting to NumPy implementation. If you want "
@@ -79,7 +79,7 @@ def cosine_similarity(X: Matrix, Y: Matrix) -> np.ndarray:
         Z = 1 - simd.cdist(X, Y, metric="cosine")
         if isinstance(Z, float):
             return np.array([Z])
-        return Z
+        return np.array(Z)
     except ImportError:
         X_norm = np.linalg.norm(X, axis=1)
         Y_norm = np.linalg.norm(Y, axis=1)
2 changes: 1 addition & 1 deletion libs/partners/mongodb/langchain_mongodb/utils.py
@@ -38,7 +38,7 @@ def cosine_similarity(X: Matrix, Y: Matrix) -> np.ndarray:
         Z = 1 - simd.cdist(X, Y, metric="cosine")
         if isinstance(Z, float):
             return np.array([Z])
-        return Z
+        return np.array(Z)
     except ImportError:
         logger.info(
             "Unable to import simsimd, defaulting to NumPy implementation. If you want "
2 changes: 1 addition & 1 deletion libs/partners/pinecone/langchain_pinecone/_utilities.py
@@ -69,7 +69,7 @@ def cosine_similarity(X: Matrix, Y: Matrix) -> np.ndarray:
         Z = 1 - simd.cdist(X, Y, metric="cosine")
         if isinstance(Z, float):
             return np.array([Z])
-        return Z
+        return np.array(Z)
     except ImportError:
         X_norm = np.linalg.norm(X, axis=1)
         Y_norm = np.linalg.norm(Y, axis=1)
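
For readers who want to try the pattern outside the repository, the following is a simplified, self-contained sketch of a `cosine_similarity` helper that always returns an `np.ndarray`, whether or not SimSIMD is installed. It is not the code from any of the four files: the input validation and logging are omitted, the `float32` cast is an assumption based on the commit message, and the sketch converts the `simd.cdist` result to NumPy before subtracting from 1, whereas the patch subtracts first and wraps the result afterwards.

```python
import numpy as np


def cosine_similarity(X, Y) -> np.ndarray:
    """Row-wise cosine similarity between X (n, d) and Y (m, d), always an ndarray."""
    # Assumption: inputs are cast to contiguous float32 before any backend sees them.
    X = np.ascontiguousarray(X, dtype=np.float32)
    Y = np.ascontiguousarray(Y, dtype=np.float32)
    try:
        import simsimd as simd

        Z = simd.cdist(X, Y, metric="cosine")  # cosine distances
        if isinstance(Z, float):
            # A single pair may come back as a plain Python float.
            return np.array([1 - Z])
        # Convert the SimSIMD output to NumPy first, then turn distance into similarity.
        return 1 - np.array(Z)
    except ImportError:
        # Pure NumPy fallback: dot products divided by the outer product of row norms.
        X_norm = np.linalg.norm(X, axis=1)
        Y_norm = np.linalg.norm(Y, axis=1)
        return (X @ Y.T) / np.outer(X_norm, Y_norm)


X = [[1.0, 0.0], [0.0, 1.0]]
Y = [[1.0, 0.0], [1.0, 1.0]]
result = cosine_similarity(X, Y)
print(type(result).__name__, result.shape)
```

Callers can then rely on ndarray methods such as `result.argsort()` regardless of which backend produced the values. The fallback does not guard against zero-norm rows, which would divide by zero.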
