Closes #47 - updated datatype handling - Decimal, datetime.date, datetime.time #70

Merged
shouples merged 18 commits into main from djs/decimal-handling
Oct 13, 2022

Conversation

shouples
Collaborator

@shouples shouples commented Oct 12, 2022

Alright, so this is another "probably should've been multiple PRs" scenario. The overall theme here is "making it easier to handle or prepare for new/non-standard data types" with the display formatting.

  • adds generator / handler functions for the decimal.Decimal, datetime.date, and datetime.time types
  • restructures datatypes.py into a separate datatypes directory for easier future development
    • datatypes/main.py handles top-level dataframe generation, with a per-column toggle for each available datatype generator helper
    • the other files under datatypes are grouped into numeric/date_time/text/geometry/misc
  • adds dedicated tests for each datatype handler
  • adds datatypes/compatibility.py::test_compatibility, which throws new data types at the pipeline to see which parts may break (pandas.io.json.build_table_schema, duckdb conn.register(), jupyter_client.jsonutil.json_clean, or all of dx.handle_format)
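The compatibility idea in the last bullet can be sketched roughly as follows. This is not the code in datatypes/compatibility.py: `check_compatibility` and the `steps` mapping are hypothetical names, and the standard-library `json.dumps` calls are stand-ins for the real stages (build_table_schema, conn.register(), json_clean, dx.handle_format). The point is only the pattern of running every stage and recording which ones fail, rather than stopping at the first error.

```python
import datetime
import decimal
import json
from typing import Any, Callable, Dict


def check_compatibility(value: Any, steps: Dict[str, Callable[[Any], Any]]) -> Dict[str, str]:
    """Run `value` through each named step and record "ok" or the error type."""
    results = {}
    for name, step in steps.items():
        try:
            step(value)
            results[name] = "ok"
        except Exception as exc:  # record the failure and keep checking the other steps
            results[name] = f"failed: {type(exc).__name__}"
    return results


# Stand-in steps (assumptions for illustration, not the dx pipeline stages).
steps = {
    "json.dumps": json.dumps,
    "json.dumps(default=str)": lambda v: json.dumps(v, default=str),
}

# One sample value per data type under test.
samples = {
    "decimal": decimal.Decimal("1.5"),
    "date": datetime.date(2022, 10, 13),
    "time": datetime.time(20, 25),
    "int": 1,
}

report = {name: check_compatibility(value, steps) for name, value in samples.items()}
for name, result in report.items():
    print(name, result)
```

Swapping the stand-ins for the real pipeline calls would give a per-type, per-stage report similar in spirit to what the PR describes.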

@shouples shouples changed the title updated datatype handling - Decimal, datetime.date, datetime.time Closes #47 - updated datatype handling - Decimal, datetime.date, datetime.time Oct 12, 2022

codecov-commenter commented Oct 13, 2022

Codecov Report

Merging #70 (2b8a5d1) into main (0e01d8d) will decrease coverage by 3.68%.
The diff coverage is 76.02%.

@@            Coverage Diff             @@
##             main      #70      +/-   ##
==========================================
- Coverage   86.92%   83.24%   -3.69%     
==========================================
  Files          19       24       +5     
  Lines        1132     1253     +121     
==========================================
+ Hits          984     1043      +59     
- Misses        148      210      +62     
Impacted Files                        Coverage Δ
src/dx/utils/__init__.py              100.00% <ø> (ø)
src/dx/datatypes/compatibility.py     14.86% <14.86%> (ø)
src/dx/datatypes/date_time.py         72.22% <68.75%> (ø)
src/dx/datatypes/text.py              78.94% <78.94%> (ø)
src/dx/formatters/main.py             88.50% <83.33%> (ø)
src/dx/datatypes/misc.py              96.10% <96.10%> (ø)
src/dx/__init__.py                    100.00% <100.00%> (ø)
src/dx/datatypes/__init__.py          100.00% <100.00%> (ø)
src/dx/datatypes/geometry.py          79.54% <100.00%> (ø)
src/dx/datatypes/main.py              100.00% <100.00%> (ø)
... and 2 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@shouples shouples marked this pull request as ready for review October 13, 2022 20:25
@shouples shouples merged commit 816a165 into main Oct 13, 2022
@shouples shouples deleted the djs/decimal-handling branch October 13, 2022 21:01
shouples added a commit that referenced this pull request Nov 14, 2022
…time.time (#70)

* add Decimal handler and generator functions; clean up random_dataframe() arguments and add decimal_column/date_column/time_column
* add datetime.date and datetime.time generators and handlers
* check for and handle decimals and datetime.dates by default
* return gpd.GeoSeries instead of GeometryArray
* add boolean series generator option
* add datatype imports with new directory structure
* ignore flake8 C901 - "too complex"
* add datatype compatibility helpers
* add optional with_ipython_display argument to prevent calling IPython.display() on an object that goes through handle_format()
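As a rough illustration of the handler commits above, here is a minimal sketch of per-column handlers for the three types. The function names are hypothetical and are not dx's actual API. Converting dates to pandas timestamps follows the linked issue; converting Decimal to float and rendering times as ISO strings are assumptions made for this example.

```python
import datetime
import decimal

import pandas as pd


def handle_decimal_series(s: pd.Series) -> pd.Series:
    """Convert decimal.Decimal values to floats (assumption; this can lose precision)."""
    return s.astype(float)


def handle_date_series(s: pd.Series) -> pd.Series:
    """Convert datetime.date values to pandas timestamps instead of strings."""
    return pd.to_datetime(s)


def handle_time_series(s: pd.Series) -> pd.Series:
    """Render datetime.time values as ISO 8601 strings (assumption)."""
    return s.apply(lambda t: t.isoformat())


# A small frame with one column per non-standard type.
df = pd.DataFrame(
    {
        "price": [decimal.Decimal("1.10"), decimal.Decimal("2.25")],
        "day": [datetime.date(2022, 10, 12), datetime.date(2022, 10, 13)],
        "at": [datetime.time(20, 25), datetime.time(21, 1)],
    }
)

clean = df.assign(
    price=handle_decimal_series(df["price"]),
    day=handle_date_series(df["day"]),
    at=handle_time_series(df["at"]),
)
print(clean.dtypes)
```

After these conversions every column holds a type that generic serializers tend to accept: floats, datetime64 values, and plain strings.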
Successfully merging this pull request may close these issues.

preprocess datetime.date to pd.Timestamp/datetime values instead of strings