Re-implementing the Postgres memory store #1735

Merged
merged 14 commits into from
Jun 29, 2023

Conversation


@JadynWong JadynWong commented Jun 27, 2023

Motivation and Context

Reimplement the Postgres memory store based on the discussion in #1338.

Description

  • Re-implement the Postgres memory store, mapping each SK collection to a Postgres table.
  • PostgresMemoryStore no longer implements the IDisposable pattern.
  • The connector no longer executes the statement that enables the pgvector extension; the statement is now documented in README.md for the user to run. It needs to be executed only once per database, and the way the extension is enabled may differ between hosting methods.
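
As a rough illustration of the points above, here is a minimal sketch of the SQL involved. It is not the code from this PR: the class name, method names, and column layout (`key`, `metadata`, `embedding`, `updated_at`) are assumptions made for the example. Only `CREATE EXTENSION IF NOT EXISTS vector` (the standard pgvector enable statement) and the `vector(n)` column type come from pgvector itself.

```csharp
using System;

// Hypothetical helper, for illustration only; not part of the connector.
public static class PostgresSqlSketch
{
    // Run once per database by the user (as the README now instructs),
    // not by the connector on every start-up.
    public const string EnableVectorExtension = "CREATE EXTENSION IF NOT EXISTS vector;";

    // Quote an identifier so that a collection name can be used as a table name.
    public static string QuoteIdentifier(string name)
    {
        if (string.IsNullOrWhiteSpace(name))
        {
            throw new ArgumentException("Identifier must not be empty.", nameof(name));
        }

        // Embedded double quotes are escaped by doubling them.
        return "\"" + name.Replace("\"", "\"\"") + "\"";
    }

    // One SK collection maps to one table; the column layout is an assumption.
    public static string BuildCreateTableSql(string schema, string collectionName, int vectorSize)
    {
        if (vectorSize <= 0)
        {
            throw new ArgumentOutOfRangeException(nameof(vectorSize));
        }

        return $"CREATE TABLE IF NOT EXISTS {QuoteIdentifier(schema)}.{QuoteIdentifier(collectionName)} (" +
               "key TEXT PRIMARY KEY, " +
               "metadata JSONB, " +
               $"embedding vector({vectorSize}), " +
               "updated_at TIMESTAMP WITH TIME ZONE)";
    }
}
```

Keeping the extension statement out of the connector means the connector's database role does not need the privilege to create extensions, which fits hosted environments where the extension is enabled through a control panel instead.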

Contribution Checklist

@JadynWong JadynWong requested a review from a team as a code owner June 27, 2023 20:25
@github-actions github-actions bot added .NET Issue or Pull requests regarding .NET code kernel Issues or pull requests impacting the core kernel labels Jun 27, 2023
@JadynWong

Hi @dmytrostruk,
Following our earlier discussion, I have implemented the new design. I hope you can review it and offer suggestions.
Thank you.

@dmytrostruk dmytrostruk self-assigned this Jun 28, 2023
@dmytrostruk dmytrostruk left a comment

@JadynWong thank you for your contribution and for acting so quickly on the Postgres memory store. This implementation has a better design and resolves many issues of the previous version, but there are still a couple of problems to resolve. They are not critical, but fixing them will improve the functionality further. Thanks again!

@dmytrostruk dmytrostruk added PR: feedback to address Waiting for PR owner to address comments/questions memory connector ai connector Anything related to AI connectors labels Jun 28, 2023
@dmytrostruk dmytrostruk mentioned this pull request Jun 28, 2023
5 tasks
dmytrostruk previously approved these changes Jun 29, 2023
@dmytrostruk dmytrostruk left a comment

@JadynWong thank you for acting quickly on the PR comments! I tested the functionality using the integration tests together with Example39_Postgres, and it works as expected! 🎉
Thanks for adding unit tests. Since the integration tests are temporarily ignored, the unit tests will increase our code coverage.

I left minor comments on the documentation and tests. As soon as they are resolved, this PR will be ready to merge to main.


Ideas for future improvements (these can be done in follow-up PRs):
Since PostgresDbClient is responsible for sending Postgres queries, it would be better for this class to operate on Postgres entities.

For example, the IPostgresDbClient interface has a method GetCollectionsAsync, but Postgres has no collection entity; it has tables. So what if the methods in IPostgresDbClient were named GetTablesAsync, CreateTableAsync, etc.?

Another example: PostgresDbClient has a method CreateCollectionAsync that makes two calls, CreateTableAsync and CreateIndexAsync. This means the class defines the logic for creating the specific Postgres table that acts as an SK collection, but that is the responsibility of the PostgresMemoryStore class.

If we follow the single responsibility principle, there would be:

  • PostgresDbClient - operates only on Postgres entities and is responsible for sending simple queries to create/delete tables and records and to search.
  • PostgresMemoryStore - works as an adapter between Semantic Kernel and Postgres, using PostgresDbClient to operate on Postgres entities.

The same applies to other connectors that have the same problem.
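
The split described above could look roughly like the following sketch. It is an illustration under assumptions, not the connector's actual code: the method signatures are guesses based on the names mentioned in the review (GetTablesAsync, CreateTableAsync), DeleteTableAsync is invented for symmetry, and FakePostgresDbClient and PostgresMemoryStoreSketch are hypothetical types that exist only so the example runs without a database.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Speaks only in Postgres terms: tables, not collections.
// Signatures are assumptions for this sketch.
public interface IPostgresDbClient
{
    Task CreateTableAsync(string tableName);
    Task DeleteTableAsync(string tableName);
    Task<IReadOnlyList<string>> GetTablesAsync();
}

// Hypothetical in-memory stand-in so the sketch runs without a database.
public sealed class FakePostgresDbClient : IPostgresDbClient
{
    private readonly HashSet<string> _tables = new HashSet<string>(StringComparer.Ordinal);

    public Task CreateTableAsync(string tableName)
    {
        _tables.Add(tableName);
        return Task.CompletedTask;
    }

    public Task DeleteTableAsync(string tableName)
    {
        _tables.Remove(tableName);
        return Task.CompletedTask;
    }

    public Task<IReadOnlyList<string>> GetTablesAsync()
    {
        // Return the names in a stable (ordinal) order.
        IReadOnlyList<string> names = _tables.OrderBy(t => t, StringComparer.Ordinal).ToList();
        return Task.FromResult(names);
    }
}

// Adapter: translates Semantic Kernel "collections" into table operations.
// Deciding which table represents a collection lives here, not in the client.
public sealed class PostgresMemoryStoreSketch
{
    private readonly IPostgresDbClient _client;

    public PostgresMemoryStoreSketch(IPostgresDbClient client)
    {
        _client = client ?? throw new ArgumentNullException(nameof(client));
    }

    public Task CreateCollectionAsync(string collectionName) => _client.CreateTableAsync(collectionName);

    public Task DeleteCollectionAsync(string collectionName) => _client.DeleteTableAsync(collectionName);

    public Task<IReadOnlyList<string>> GetCollectionsAsync() => _client.GetTablesAsync();
}
```

With this shape, the client can be replaced by a fake in unit tests, and the store remains the only place that knows how an SK collection is represented in Postgres.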

@JadynWong
Great advice. I'd like to address it in a follow-up PR.

@dmytrostruk dmytrostruk left a comment

LGTM!

@dmytrostruk dmytrostruk added PR: ready for review All feedback addressed, ready for reviews PR: ready to merge PR has been approved by all reviewers, and is ready to merge. and removed PR: feedback to address Waiting for PR owner to address comments/questions PR: ready for review All feedback addressed, ready for reviews labels Jun 29, 2023
@shawncal shawncal added this pull request to the merge queue Jun 29, 2023
Merged via the queue into microsoft:main with commit 07aa6a7 Jun 29, 2023
10 checks passed
@JadynWong JadynWong deleted the jadyn/re-implement-postgres-memory branch June 30, 2023 14:42
@evchaki evchaki added this to the Sprint 34 milestone Jun 30, 2023
@JadynWong JadynWong mentioned this pull request Jun 30, 2023
5 tasks
github-merge-queue bot pushed a commit that referenced this pull request Jul 3, 2023
### Motivation and Context
Follow the advice in #1735 (review) to rename methods and separate responsibilities.

### Description
- Separate the responsibilities of PostgresMemoryStore and PostgresDbClient.
- Add real batch methods to PostgresDbClient.
- Save the timestamp with the `TIMESTAMP WITH TIME ZONE` type.
- Stop creating the index.
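
To make two of these items concrete, here is a small sketch. It assumes that a "real" batch method means one query for many keys instead of one query per key, and that timestamps are normalized to UTC before being written. Both are assumptions for illustration; the class name, method names, and column names (including `updated_at`) are hypothetical and not taken from the PR.

```csharp
using System;

// Hypothetical helper, for illustration only.
public static class PostgresBatchSketch
{
    // Build a single SELECT that fetches many keys at once.
    // @keys is meant to be bound as a text[] parameter by the caller.
    public static string BuildGetBatchSql(string tableName, bool withEmbeddings)
    {
        if (string.IsNullOrWhiteSpace(tableName))
        {
            throw new ArgumentException("Table name must not be empty.", nameof(tableName));
        }

        // Quote the identifier; embedded double quotes are doubled.
        string quoted = "\"" + tableName.Replace("\"", "\"\"") + "\"";
        string columns = withEmbeddings
            ? "key, metadata, embedding, updated_at"
            : "key, metadata, updated_at";

        return $"SELECT {columns} FROM {quoted} WHERE key = ANY(@keys)";
    }

    // TIMESTAMP WITH TIME ZONE stores an absolute instant, so normalizing
    // to UTC before writing preserves the point in time.
    public static DateTimeOffset? ToUtc(DateTimeOffset? timestamp) => timestamp?.ToUniversalTime();
}
```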

### Contribution Checklist
- [x] The code builds clean without any errors or warnings
- [x] The PR follows SK Contribution Guidelines (https://github.com/microsoft/semantic-kernel/blob/main/CONTRIBUTING.md)
- [x] The code follows the .NET coding conventions (https://learn.microsoft.com/dotnet/csharp/fundamentals/coding-style/coding-conventions) verified with `dotnet format`
- [x] All unit tests pass, and I have added new tests where possible
- [x] I didn't break anyone 😄

---------

Co-authored-by: Dmytro Struk <13853051+dmytrostruk@users.noreply.github.com>