@no-dice-io
Contributor

Given that LlamaIndex has some of the broadest and most robust support for retrieval abstractions and methods, adding an interface from LlamaIndex into DSPy would allow many users to leverage the retriever classes available in LlamaIndex.

https://docs.llamaindex.ai/en/stable/api_reference/retrievers/

@property
def similarity_top_k(self) -> int:
    """Return similarity top k of retriever."""
    return self.retriever.similarity_top_k

Not every retriever in llama-index will have this property 👀 It kind of depends: the BaseRetriever class is fairly generic and just exposes the retrieve()/aretrieve() methods.

Contributor Author

Ooh thank you for pointing this out - I'll make an adjustment shortly!

Collaborator

Thanks @logan-markewich for looking into this!

Contributor Author

I believe I've resolved this - please review!

I think this looks good, but the output should be typed as Optional[int].

Kind of curious what the motivation is to expose this on the class? Is this something that dspy can optimize later?
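
For illustration, here is a minimal sketch of the guarded, Optional[int]-typed property suggested above. It is not the code from this PR: RetrieverWrapper, WithTopK, and WithoutTopK are hypothetical stand-ins, and the only assumption carried over from the thread is that some retrievers expose similarity_top_k and others do not.

```python
from typing import Any, Optional


class RetrieverWrapper:
    """Hypothetical stand-in for the wrapper class under review."""

    def __init__(self, retriever: Any) -> None:
        self.retriever = retriever

    @property
    def similarity_top_k(self) -> Optional[int]:
        """Return the retriever's top-k, or None if it does not expose one."""
        # getattr with a default avoids an AttributeError on generic retrievers.
        return getattr(self.retriever, "similarity_top_k", None)


class WithTopK:
    """Hypothetical retriever that exposes similarity_top_k."""

    similarity_top_k = 3


class WithoutTopK:
    """Hypothetical retriever that only offers retrieve()."""

    def retrieve(self, query: str) -> list:
        return []


top_k_present = RetrieverWrapper(WithTopK()).similarity_top_k
top_k_missing = RetrieverWrapper(WithoutTopK()).similarity_top_k
```

Returning None rather than raising leaves it to the caller to decide how to handle a retriever that has no top-k setting.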

Contributor Author

I don't recall if it can be optimized, but it's a parameter that is often used with their forward method, so I was looking for a way to pass it to the underlying LlamaIndex retriever.
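
A rough sketch of that pass-through, assuming a forward(query, k) signature and plain-string results. The class names below are hypothetical and this is not the merged implementation; a real LlamaIndex retriever returns node objects rather than strings.

```python
from typing import Any, List, Optional


class TopKRetriever:
    """Hypothetical retriever with a configurable similarity_top_k."""

    def __init__(self, docs: List[str], similarity_top_k: int = 2) -> None:
        self.docs = docs
        self.similarity_top_k = similarity_top_k

    def retrieve(self, query: str) -> List[str]:
        return self.docs[: self.similarity_top_k]


class FixedRetriever:
    """Hypothetical retriever that has no top-k setting at all."""

    def __init__(self, docs: List[str]) -> None:
        self.docs = docs

    def retrieve(self, query: str) -> List[str]:
        return list(self.docs)


class TopKForwardingWrapper:
    """Hypothetical wrapper that forwards k to the underlying retriever."""

    def __init__(self, retriever: Any) -> None:
        self.retriever = retriever

    def forward(self, query: str, k: Optional[int] = None) -> List[str]:
        # Override top-k only when the caller passes k and the retriever supports it.
        if k is not None and hasattr(self.retriever, "similarity_top_k"):
            self.retriever.similarity_top_k = k
        return self.retriever.retrieve(query)


docs = ["a", "b", "c"]
default_hits = TopKForwardingWrapper(TopKRetriever(docs)).forward("q")
one_hit = TopKForwardingWrapper(TopKRetriever(docs)).forward("q", k=1)
ignored_k = TopKForwardingWrapper(FixedRetriever(docs)).forward("q", k=1)
```

Note that this sketch mutates the wrapped retriever, so a k passed to one call stays in effect for later calls that omit it.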

Collaborator

@ammirsm ammirsm left a comment

Thanks for the contribution and help!

Overall the PR LGTM. A couple of small things need attention, and then we can approve and merge it.

pyproject.toml Outdated
psycopg2 = { version = "^2.9.9", optional = true }
pgvector = { version = "^0.2.5", optional = true }
structlog = "^24.1.0"
llama-index = "^0.10.30"
Collaborator

Don't you think it should be optional?
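
One common way to keep a dependency optional is to import it lazily and fail with an actionable message. The sketch below is only an illustration: require_optional is a hypothetical helper, not part of DSPy or LlamaIndex.

```python
import importlib
from types import ModuleType


def require_optional(module_name: str, install_hint: str) -> ModuleType:
    """Import an optional dependency, or raise ImportError with an install hint."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"{module_name} is required for this feature. "
            f"Install it with: {install_hint}"
        ) from exc


# Works for any installed module; json is used here only as a stand-in.
json_module = require_optional("json", "part of the standard library")
```

A retriever module could call something like require_optional("llama_index.core", "pip install llama-index") inside its constructor, so that importing the package itself never requires llama-index.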

Collaborator

I know it is causing a lot of confusion, but right now we declare dependencies in three different places. I have a PR to fix that shortly, but if this is going to be merged before then, you need to add your dependencies in requirements.txt (which we use for building our package), here, where you added it, and under dependencies in pyproject.toml.

more info:
#819

Contributor Author

I'm open to making the dependency optional, but wouldn't my tests fail, as you pointed out, if we go that route?

Collaborator

I think the best way is to tag those tests so they only run when llama-index is installed, and then we fix the CI based on that.

Contributor Author

I'll go with that. I'll make the adjustment today or tomorrow. I'll only run those tests when llamaindex is present.
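
As a sketch of what that gating could look like, the example below uses only the standard library so it stays self-contained. The test class and its body are hypothetical, not the tests from this PR.

```python
import importlib.util
import unittest

# True only when the optional llama-index package can be found.
LLAMA_INDEX_INSTALLED = importlib.util.find_spec("llama_index") is not None


@unittest.skipUnless(LLAMA_INDEX_INSTALLED, "llama-index is not installed")
class TestLlamaIndexRetriever(unittest.TestCase):
    """Hypothetical test case that is skipped without llama-index."""

    def test_package_is_importable(self) -> None:
        # Import inside the test so loading this file never fails without the package.
        import llama_index  # noqa: F401

        self.assertTrue(LLAMA_INDEX_INSTALLED)


# Run the case programmatically; it is reported as skipped when the package is absent.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestLlamaIndexRetriever)
result = unittest.TestResult()
suite.run(result)
```

In a pytest suite the same effect is usually achieved with pytest.importorskip("llama_index") at the top of the test module, or with a pytest.mark.skipif marker built from the same find_spec check.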



def test_lirm_as_rm(rag_setup):
    """Test the retriever as retriever method"""
Collaborator

I think this test will fail in the CI if we make that dependency optional...

@ammirsm
Collaborator

ammirsm commented May 6, 2024

There are some merge conflicts in pyproject and poetry and I think after that we should be good to go.

@no-dice-io
Contributor Author

> There are some merge conflicts in pyproject and poetry and I think after that we should be good to go.

I believe I've resolved this now!

Collaborator

@ammirsm ammirsm left a comment

LGTM.

@arnavsinghvi11 arnavsinghvi11 merged commit 5db3e34 into stanfordnlp:main May 21, 2024
@arnavsinghvi11
Collaborator

Thanks all!
