List of updates for version 2022.11 #34

Closed
17 of 28 tasks
root-11 opened this issue Oct 12, 2022 · 5 comments

root-11 commented Oct 12, 2022

  • A tickmark means done in branch v2022_11_0. See comments for commit references.

Config

  • Setting so that tablite can run single core only, if educators want this.
  • allow multiple h5 files. tempdir is thereby the default storage, but the user can point individual tables to other storages. See #22 (Make it possible to work with multiple HDF repos).
  • allow user to set h5 file to "memory" and exclusively use RAM.

new tools.py

  • date_range(start, stop, step) like 2022/1/1, 2023/1/1, timedelta(days=1) # returns list of datetimes.
  • add datatypes.guess
  • add xround
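
A minimal sketch of what date_range could look like, using only the standard library. This is not the code from the branch: treating stop as exclusive and rejecting non-positive steps are assumptions made for the illustration.

```python
from datetime import datetime, timedelta


def date_range(start, stop, step):
    """Return a list of datetimes from start up to (not including) stop.

    Assumption: stop is exclusive and step is a positive timedelta.
    """
    if step <= timedelta(0):
        raise ValueError("step must be a positive timedelta")
    result = []
    current = start
    while current < stop:
        result.append(current)
        current += step
    return result


# One entry per day of 2022.
days = date_range(datetime(2022, 1, 1), datetime(2023, 1, 1), timedelta(days=1))
print(len(days), days[0], days[-1])
```

With an exclusive stop, the last value in the example is 2022-12-31, so the list covers exactly the calendar year.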

new Table methods

  • table.remove_duplicate_rows()
  • table.drop_na(*arg) removes rows containing None or np.nan
  • table.replace(target, replacement) which searches across all columns, e.g. t.replace(None, -1)
  • table.replace_missing_values(source=[...], target=column_name) which looks up the nearest neighbour in the sources and substitutes it into target. See #18 (implement replace missing values with MP support).
  • table.to_pandas()
  • Table.from_pandas(pd.DataFrame)
  • table.to_h5()
  • Table.from_h5()
  • table.to_dict(columns, slice) # to_dict returns a python dict with column names as keys and lists of values as values. The optional slice permits efficient retrieval of a subset of rows.
  • Table.from_dict()
  • table.to_list() returns list of column names, + list of each column
  • Table.from_list(column names + data) (the inverse of .to_list())
  • table.transpose(columns=['Monday', 'Tuesday','Wednesday', 'Thursday', 'Friday'], as='day') turns the columns into a single column under the heading of 'day'
  • complete the list of importable formats.
  • remove requirement for column name declaration
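
To make the intended behaviour of drop_na, replace and remove_duplicate_rows concrete, here is a rough sketch on a plain dict of lists standing in for a Table. It does not use tablite; the function signatures, the default missing-value markers, and the helper names (`_is_na`, `_rows`, `_columns`) are assumptions for illustration only.

```python
import math


def _is_na(value):
    """True for None and float NaN (assumed default missing-value markers)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def _rows(table):
    """Iterate over rows of a dict-of-lists table as tuples."""
    return zip(*table.values())


def _columns(names, rows):
    """Rebuild a dict-of-lists table from column names and row tuples."""
    rows = list(rows)
    return {name: [row[i] for row in rows] for i, name in enumerate(names)}


def drop_na(table):
    """Keep only the rows that contain no missing value."""
    kept = (row for row in _rows(table) if not any(_is_na(v) for v in row))
    return _columns(list(table), kept)


def replace(table, target, replacement):
    """Replace every occurrence of target across all columns."""
    return {
        name: [replacement if (v is target or v == target) else v for v in col]
        for name, col in table.items()
    }


def remove_duplicate_rows(table):
    """Keep the first occurrence of each row, preserving order."""
    seen = set()
    kept = []
    for row in _rows(table):
        if row not in seen:
            seen.add(row)
            kept.append(row)
    return _columns(list(table), kept)


t = {"a": [1, 2, None, 2], "b": [10.0, float("nan"), 30.0, 20.0]}
cleaned = drop_na(t)           # rows 2 and 3 contain NaN / None and are dropped
filled = replace(t, None, -1)  # mirrors t.replace(None, -1) from the list above
unique = remove_duplicate_rows({"a": [1, 1, 2], "b": ["x", "x", "y"]})
print(cleaned, filled["a"], unique)
```

The sketch returns new dicts rather than mutating in place; whether the real methods mutate the table or return a copy is not stated in this issue.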

new Column methods:

  • Column[:] returns list of native python types. No more numpy arrays or numpy types.
  • Column.to_numpy(slice) returns numpy's ndarray.

Documentation

Cleaner code:

  • test for page size limit. Add config to break early (90% of limit) or break late (110% of limit).
  • test for various np broadcast functions.
@root-11 root-11 added the enhancement New feature or request label Oct 12, 2022
@root-11 root-11 self-assigned this Oct 12, 2022

root-11 commented Oct 18, 2022

See commit cb337f2 for date_range, xround and guess

root-11 commented Oct 18, 2022

See commit 9c82080 for drop

root-11 commented Nov 5, 2022

Tablite 2022.11 is available in pre-release as 2022.11.dev1 on pypi.

root-11 commented Nov 20, 2022

Added completed changes in the changelog. https://github.com/root-11/tablite/blob/master/changelog.md

root-11 commented Jul 11, 2023

All completed in 2023.

@root-11 root-11 closed this as completed Jul 11, 2023