Migrate Scan Operators to Scan Function #2972

Open
andyfengHKU opened this issue Feb 29, 2024 · 0 comments

For the purpose of code maintenance, we should reduce the total number of operators in the code base. In particular, we should merge the different scan operators into a single operator and implement each scan as a separate scan function. The scan operators/functions currently in the code base are:

  1. scan csv
  2. scan parquet
  3. scan rdf
  4. scan numpy
  5. scan pandas
  6. scan factorized table
  7. scan union all table
  8. scan simple aggregate
  9. scan hash aggregate
  10. scan order by table
  11. scan node id
  12. scan column
  13. scan csr

For 1-6, we have already moved them into scan functions inside InQueryCall (we should rename this operator to Scan in the end). We should do the same for the remaining scans. 7-10 are in-memory tables, so their implementation should be similar to 6. 11-13 involve scanning from storage, so I would suggest that whoever implements them discuss with @ray6080 first.
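
To make the proposal concrete, here is a rough sketch of the "one operator, many scan functions" shape. It is written in Python only for brevity and does not reflect the actual code base: `SCAN_FUNCTIONS`, `register_scan`, the `Scan` class, and the two example function names are all hypothetical, and the in-memory tables are stand-ins modelled as plain lists of tuples.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterator, Tuple

Row = Tuple[object, ...]
ScanFunction = Callable[[object], Iterator[Row]]

# Hypothetical registry mapping a scan function name to its implementation.
SCAN_FUNCTIONS: Dict[str, ScanFunction] = {}


def register_scan(name: str) -> Callable[[ScanFunction], ScanFunction]:
    """Register a scan function under `name` (illustrative helper)."""
    def decorator(func: ScanFunction) -> ScanFunction:
        SCAN_FUNCTIONS[name] = func
        return func
    return decorator


@register_scan("scan_factorized_table")
def scan_factorized_table(rows) -> Iterator[Row]:
    # Stand-in for item 6: an in-memory table modelled as a list of tuples.
    yield from rows


@register_scan("scan_union_all_table")
def scan_union_all_table(tables) -> Iterator[Row]:
    # Stand-in for item 7: emit the rows of each child table in order.
    for table in tables:
        yield from table


@dataclass
class Scan:
    """Single physical operator; the scan-specific logic lives in the function."""
    function_name: str
    bind_data: object

    def execute(self) -> Iterator[Row]:
        func = SCAN_FUNCTIONS[self.function_name]
        return func(self.bind_data)


if __name__ == "__main__":
    op = Scan("scan_union_all_table", [[(1,)], [(2,), (3,)]])
    for row in op.execute():
        print(row)
```

Under this shape, adding a new scan means registering one more function rather than introducing a new operator class, which is the maintenance benefit argued for above. The storage-backed scans (11-13) would likely need more than a single `bind_data` argument, so treat this as a starting point for discussion only.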
