Improve the docs for developers (Create a Developers / Hackers guide for DataFusion) #5501

Closed
1 of 7 tasks
Tracked by #3058
alamb opened this issue Mar 7, 2023 · 3 comments · Fixed by #6056
Labels: enhancement (New feature or request)

Comments

alamb commented Mar 7, 2023

Is your feature request related to a problem or challenge? Please describe what you are trying to do.
DataFusion is primarily aimed at developers, as explained in https://github.com/apache/arrow-datafusion#use-cases

Thus, it will help if we provide documentation that helps developers understand what DataFusion offers and whether it is appropriate for their project.

Describe the solution you'd like
I would like a DataFusion Architecture guide, aimed at other developers, that contains high-level information about how DataFusion is organized (for example, with the content described in #5499).

The trick will be to keep the guide helpful but general enough that it doesn't get out of date too quickly.

Topics:

  • Basic flow
  • Important structures (LogicalPlan, Exprs, SessionContext, PhysicalExpr, ExecutionPlan)
  • DataSources (TableProvider trait, etc)
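
To make the "basic flow" topic concrete, here is a minimal, std-only sketch of a logical plan being optimized, turned into a physical plan, and executed. Every name in it (`ToyLogicalPlan`, `ToyExecutionPlan`, `ScanExec`, `FilterExec`, `optimize`, `create_physical_plan`, `run`) is a hypothetical stand-in chosen for illustration; none of this is DataFusion's actual API, and SQL parsing is left out entirely.

```rust
use std::collections::HashMap;

// Toy model only: these types are hypothetical and are NOT DataFusion's types.

/// A logical plan describes *what* to compute.
#[derive(Debug, Clone, PartialEq)]
enum ToyLogicalPlan {
    /// Read every row of a named table.
    Scan { table: String },
    /// Keep only rows whose value is strictly greater than `min`.
    Filter { min: i64, input: Box<ToyLogicalPlan> },
}

/// A physical plan describes *how* to compute it.
trait ToyExecutionPlan {
    fn execute(&self) -> Vec<i64>;
}

struct ScanExec {
    rows: Vec<i64>,
}

struct FilterExec {
    min: i64,
    input: Box<dyn ToyExecutionPlan>,
}

impl ToyExecutionPlan for ScanExec {
    fn execute(&self) -> Vec<i64> {
        self.rows.clone()
    }
}

impl ToyExecutionPlan for FilterExec {
    fn execute(&self) -> Vec<i64> {
        self.input
            .execute()
            .into_iter()
            .filter(|v| *v > self.min)
            .collect()
    }
}

/// One optimizer rule: collapse two stacked filters into a single filter
/// that uses the stricter (larger) bound.
fn optimize(plan: ToyLogicalPlan) -> ToyLogicalPlan {
    match plan {
        ToyLogicalPlan::Filter { min, input } => match optimize(*input) {
            ToyLogicalPlan::Filter { min: inner, input } => ToyLogicalPlan::Filter {
                min: min.max(inner),
                input,
            },
            other => ToyLogicalPlan::Filter {
                min,
                input: Box::new(other),
            },
        },
        scan => scan,
    }
}

/// Turn a logical plan into a physical plan, resolving table names against
/// an in-memory catalog. Returns `None` if a table is unknown.
fn create_physical_plan(
    plan: &ToyLogicalPlan,
    catalog: &HashMap<String, Vec<i64>>,
) -> Option<Box<dyn ToyExecutionPlan>> {
    match plan {
        ToyLogicalPlan::Scan { table } => {
            let rows = catalog.get(table)?.clone();
            Some(Box::new(ScanExec { rows }))
        }
        ToyLogicalPlan::Filter { min, input } => {
            let input = create_physical_plan(input, catalog)?;
            Some(Box::new(FilterExec { min: *min, input }))
        }
    }
}

/// The whole flow: optimize, plan physically, execute.
fn run(plan: ToyLogicalPlan, catalog: &HashMap<String, Vec<i64>>) -> Option<Vec<i64>> {
    let optimized = optimize(plan);
    let physical = create_physical_plan(&optimized, catalog)?;
    Some(physical.execute())
}
```

Calling `run` with a plan and a catalog walks all three stages in order. The split between an enum for the logical plan and a trait object for the physical plan is just one way to show why the two layers are separate; a real engine has many more node types and streams batches of rows rather than returning a `Vec`.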

Specific tasks

Describe alternatives you've considered

Additional context
Part of a larger effort to improve documentation #3058

@alamb alamb added the enhancement New feature or request label Mar 7, 2023
alamb commented Apr 4, 2023

One thing I have been thinking about is where to put this information.

So far I have found at least two possibilities:
https://arrow.apache.org/datafusion/contributor-guide/architecture.html
https://docs.rs/datafusion/latest/datafusion/index.html#parse-plan-optimize-execute

In general, I think keeping the architecture / code documentation as close to the code as possible is a good idea, so I am leaning towards keeping it in doc comments.

@alamb alamb self-assigned this Apr 5, 2023
alamb commented Apr 5, 2023

I plan to work on this item over the next few weeks.

alamb commented Apr 8, 2023

I have begun cleaning up. First PR: #5921

@alamb alamb changed the title Create a Developers / Hackers guide for DataFusion Improve the docs for developers (Create a Developers / Hackers guide for DataFusion) Apr 8, 2023