Documentation checklist #2604

Open
33 of 42 tasks
samukweku opened this issue Sep 4, 2020 · 11 comments
Labels
documentation EPIC ⭐ Big task that may encompass many smaller ones

Comments

@samukweku
Collaborator

samukweku commented Sep 4, 2020

The aim here is to keep a checklist of documentation suggestions and to track progress, so that others who have suggestions or are already working on an item can reference it here. Contributions and suggestions on what the documentation should include are welcome.

Content

  • pandas vs datatable syntax comparison #2679

  • R data.table vs datatable syntax comparison #2611

  • DOCS: list of features from R which are (not) available in python #2400

  • SQL vs datatable syntax comparison #2665

  • add examples to functions in docstrings; this will help users quickly understand how to use the functions

  • transformation documentation

    • Create a column/multiple columns
    • Mutate existing columns
    • Operation between columns
    • Apply function across columns (row-wise and column-wise)
    • Iteration through a frame
    • Sorting on a Frame
    • [ ] Aggregations (with and without groupby); refer to the `by` documentation
    • Datatable rules on column names during transformation
    • [ ] Combining columns/frames (rbind/cbind)
    • [ ] Comparing frames
    • Conditional transformations
  • Selecting and Filtering data (i and j)

    • Selection by label/position/callable/type? (j)
    • notes/warnings on combining selection options
    • Filtering/selecting rows in the i section (position/callable/type?)
    • Drop rows/columns
    • Selecting with missing labels
    • Warning on use of python keywords for column names
    • Single value access
    • Boolean indexing
    • Bracket notation vs attribute selection with f?
    • Note on irrelevance of index labels?
  • documentation on joins

    • Combining Frames horizontally and vertically (cbind/rbind)
    • Combining Frames in SQL-like manner (left join)
    • Limitations of SQL-like join
    • Comparing Frames
  • string operations in datatable

  • cookbook

  • updates to existing documentation (e.g. fread ... more information on fread and Excel (named ranges, ...))

  • iread examples: how to use iread to read a list of CSV files #3353

  • Document all functions in the API: API Documentation checklist #2586

Technical

@st-pasha
Contributor

st-pasha commented Sep 4, 2020

Hi Samuel,
thanks for taking up this initiative. As you probably noticed, we've spent a significant effort lately to improve our documentation. We're currently mostly focused on documenting the API (see #2586), but various tutorials / user guides might be just as useful if not more useful.
I've sent you an invite to join the project, so that you can create PRs directly (without going through the trouble of forking), and also better manage issues: assign labels, projects, milestones, etc.

@samukweku
Collaborator Author

Yes, I noticed documentation on the API. I was particularly happy to see the guide on creating and adding a function to datatable. Thanks for the opportunity @st-pasha

@myamullaciencia

Great initiative, @samukweku. I will choose one or two sections from the checklist and get back to you here soon, and I will let you know if I need any help.

@samukweku
Collaborator Author

@myamullaciencia cool! Will be on the lookout for your PR

@st-pasha
Contributor

Samuel, have you seen #2400 -- do you think it's something that can be incorporated inside your data.table/datatable comparison page?

st-pasha added the EPIC ⭐ Big task that may encompass many smaller ones label Sep 16, 2020
@samukweku
Collaborator Author

@st-pasha It is a good idea. I will make a PR to cover that.

st-pasha pushed a commit that referenced this issue Oct 8, 2020
- Doc highlighting similar operations in SQL and `datatable`
- Also highlights some operations in SQL that are currently not possible in `datatable`

References #2604
@samukweku
Collaborator Author

@st-pasha I noticed that the datatable docs site does not get updated when new docs are added. The same goes for the changelog. Is there a reason for this? Maybe a design decision? I believe users (me included) would love to get access to new information.

@st-pasha
Contributor

I've set up a webhook just now, so hopefully it'll start building automatically now.

@samukweku
Collaborator Author

@st-pasha No worries; I'll keep checking the site. At the moment, it has not updated.

@samukweku
Collaborator Author

Website updates fine now

st-pasha pushed a commit that referenced this issue Jul 31, 2021
WIP - #2604

Update the select and filter data doc to use the Type class
oleksiyskononenko pushed a commit that referenced this issue Sep 20, 2021
Documentation on data transformation (column creation/mutation, frame iteration, operation between columns, frame sorting).
 
WIP for #2604
@samukweku
Collaborator Author

  • update transform documentation on extend - it covers more than just a dictionary.
  • Add example to select documentation on columns with alias for renaming

Projects
Documentation
  
To do

3 participants