GitHub Data Source Integration #1233

Merged Oct 2, 2023 (20 commits)

@xzdandy xzdandy commented Sep 28, 2023

  • GitHub data source integration
  • Batching support for the native storage engine: batching inside the storage engine does not work with LIMIT, so this change was reverted.
  • Full NamedUser table support
  • Enable the CircleCI local PR cache for testmondata
  • Native storage engine read refactoring
  • Test cases
  • GitHub data source documentation

@xzdandy xzdandy added the Data Sources Features/Bugs related to Data Sources label Sep 28, 2023
@xzdandy xzdandy added this to the v0.3.7 milestone Sep 28, 2023
@xzdandy xzdandy self-assigned this Sep 28, 2023
@xzdandy xzdandy linked an issue Sep 28, 2023 that may be closed by this pull request
2 tasks

xzdandy commented Sep 28, 2023

@gaurav274 @jiashenC Please review the design.

The purpose of the integration:

  1. GitHub should be a data source rather than a user-defined function, which improves usability.
  2. It is one step toward a single query for stargazer analysis and optimization.
  3. It is the first non-SQL data source integration and can serve as a template for later ones.

PS: testmon finally worked.

@jiashenC jiashenC left a comment

The overall design looks good to me. One question: this is more of a stargazer-specific handler, right? If we later want to support other GitHub API features, what is the plan for adding them?

xzdandy commented Sep 28, 2023

> The overall design looks good to me. One question: this is more of a stargazer-specific handler, right? If we later want to support other GitHub API features, what is the plan for adding them?

We will add another table mapping to supported_table.
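
A minimal sketch of what such a table mapping might look like, for readers who have not opened the diff. Everything here is an assumption made for illustration: the class name `GithubHandlerSketch`, the `forks` table, the column lists, and the placeholder rows are hypothetical and are not taken from the PR. Only the idea of a `supported_table` mapping comes from the reply above.

```python
from typing import Any, Dict, Iterator, List

Row = Dict[str, Any]


class GithubHandlerSketch:
    """Hypothetical handler: each supported table maps to columns and a row generator."""

    def __init__(self, owner: str, repo: str) -> None:
        self.owner = owner
        self.repo = repo

    def _stargazers(self) -> Iterator[Row]:
        # Placeholder rows; a real handler would page through the GitHub API here.
        yield {"login": "alice", "id": 1}
        yield {"login": "bob", "id": 2}

    def _forks(self) -> Iterator[Row]:
        # Supporting another API feature means adding one generator like this...
        yield {"full_name": f"alice/{self.repo}", "id": 10}

    @property
    def supported_table(self) -> Dict[str, Dict[str, Any]]:
        # ...and one more entry in this mapping (assumed shape, not the PR's).
        return {
            "stargazers": {"columns": ["login", "id"], "generator": self._stargazers},
            "forks": {"columns": ["full_name", "id"], "generator": self._forks},
        }

    def get_tables(self) -> List[str]:
        # Table names are derived from the mapping, so new tables appear automatically.
        return sorted(self.supported_table)

    def select(self, table_name: str) -> List[Row]:
        # Look up the table, then project each generated row onto its declared columns.
        if table_name not in self.supported_table:
            raise ValueError(f"Unsupported table: {table_name}")
        entry = self.supported_table[table_name]
        return [{col: row[col] for col in entry["columns"]} for row in entry["generator"]()]
```

With this shape, exposing a new GitHub API feature touches only the mapping and one generator; the lookup logic in `select` stays unchanged.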

@jarulraj

@kaushikravichandran Can you help add a docs PR for the app integration handler steps? thanks!

@kaushikravichandran kaushikravichandran added the Integrations 🧩 Pull requests that update an integration label Sep 29, 2023

Review thread on `class GithubHandler(DBHandler):`

Member:

@kaushikravichandran @xzdandy Do we want to merge PR #1033 first?

Collaborator Author:

We'd like to introduce SELECT, WRITE, CREATE, and DROP interfaces for non-SQL data source integration and use them in native_storage_engine.py. Refactoring that pull request will take some effort. To unblock the optimization work in the short term, I chose to implement the GitHub data source first. This PR includes the SELECT part, which can serve as a reference for the remaining interfaces.
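
As a rough illustration of the plan described above, the following sketch shows one way a non-SQL handler interface could expose SELECT now and leave WRITE, CREATE, and DROP for later. The names `NonSqlHandlerSketch`, `ReadOnlySource`, and `read`, along with all method signatures, are hypothetical and do not reflect the actual code in this PR or in native_storage_engine.py.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, Iterable, Iterator, List

Row = Dict[str, Any]


class NonSqlHandlerSketch(ABC):
    """Hypothetical interface: SELECT is required, the other operations come later."""

    @abstractmethod
    def select(self, table_name: str) -> Iterator[Row]:
        """Yield the rows of table_name as dictionaries."""

    def write(self, table_name: str, rows: Iterable[Row]) -> None:
        raise NotImplementedError("WRITE is not implemented yet")

    def create(self, table_name: str, columns: List[str]) -> None:
        raise NotImplementedError("CREATE is not implemented yet")

    def drop(self, table_name: str) -> None:
        raise NotImplementedError("DROP is not implemented yet")


class ReadOnlySource(NonSqlHandlerSketch):
    """Toy data source backed by in-memory rows; it implements SELECT only."""

    def __init__(self, tables: Dict[str, List[Row]]) -> None:
        self._tables = tables

    def select(self, table_name: str) -> Iterator[Row]:
        yield from self._tables[table_name]


def read(handler: NonSqlHandlerSketch, table_name: str) -> Iterator[Row]:
    # Stand-in for a storage-engine read that delegates to the handler.
    return handler.select(table_name)
```

A read-only source such as GitHub fits this shape naturally: it overrides `select` and inherits the unimplemented write-side operations.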

@xzdandy xzdandy marked this pull request as ready for review September 30, 2023 09:01
@xzdandy xzdandy modified the milestones: v0.3.7, v0.3.8 Sep 30, 2023
@xzdandy xzdandy merged commit 495ce7d into staging Oct 2, 2023
7 checks passed
@xzdandy xzdandy deleted the 1231-github-data-source-intergration branch October 2, 2023 06:45
a0x8o pushed a commit to alexxx-db/eva that referenced this pull request Oct 30, 2023
- [x] GitHub data source integration
- [x] Batching support for the native storage engine: batching inside the
storage engine does not work with LIMIT, so this change was reverted.
- [x] Full NamedUser table support
- [x] Enable the CircleCI local PR cache for testmondata
- [x] Native storage engine `read` refactoring
- [x] Test cases
- [x] GitHub data source documentation
a0x8o pushed a commit to alexxx-db/eva that referenced this pull request Nov 22, 2023
- [x] GitHub data source integration
- [x] Batching support for the native storage engine: batching inside the
storage engine does not work with LIMIT, so this change was reverted.
- [x] Full NamedUser table support
- [x] Enable the CircleCI local PR cache for testmondata
- [x] Native storage engine `read` refactoring
- [x] Test cases
- [x] GitHub data source documentation