Allow different data types per curve in data section reader #461

Merged
merged 6 commits into from
Apr 26, 2021

@kinverarity1 kinverarity1 commented Apr 25, 2021

This commit changes the data section reader function (reader.py:read_data_section_iterative) to return a generator which yields the data ndarray (1D) for each curve in turn. The old behaviour was for the function to return a 2D ndarray.
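
As a rough illustration of the new shape of the return value, here is a minimal sketch of a reader that yields one 1D array per curve rather than returning a single 2D array. This is not the code in reader.py: the name `iter_curve_arrays`, the flat `tokens` input and the use of plain Python callables for `dtypes` are assumptions made for this example. Only `n_columns` and the idea of a dtype per curve come from the description.

```python
import numpy as np


def iter_curve_arrays(tokens, n_columns, dtypes=None):
    """Yield one 1D ndarray per column (hypothetical helper, not lasio's API).

    tokens    : flat sequence of string values, row after row
    n_columns : number of curves, used to regroup the flat tokens into rows
    dtypes    : optional list of callables (e.g. float, str), one per column
    """
    if len(tokens) % n_columns != 0:
        raise ValueError("number of values is not a multiple of n_columns")

    # Regroup the flat token list into rows of n_columns values each.
    rows = [tokens[i:i + n_columns] for i in range(0, len(tokens), n_columns)]

    for col in range(n_columns):
        convert = dtypes[col] if dtypes is not None else float
        # Each curve is converted on its own, so dtypes can differ per column.
        yield np.array([convert(row[col]) for row in rows])


tokens = ["100.0", "2.5", "SAND", "100.5", "2.7", "SHALE"]
for curve in iter_curve_arrays(tokens, n_columns=3, dtypes=[float, float, str]):
    print(curve.dtype, curve)
```

In this sketch a text column is converted separately, so it does not force the numeric columns to be stored as str.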

It also:

  • moves the reshaping of the data array to 2D so that it occurs solely inside the data section reader function, which has a new argument "n_columns" to allow this to happen
  • adds a dtypes kwarg to las.py:LASFile.read
  • adds the reader.py:identify_dtypes_from_data function to implement dtypes="auto" (the default value) by using the first row of
    data to automatically identify column data types
  • changes las.py:LASFile.read to cater for the above
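
To make the dtypes="auto" idea concrete, here is a minimal sketch of inferring one type per column from the first data row. It is an assumption about the general approach, not the body of reader.py:identify_dtypes_from_data; the function name `guess_dtypes_from_first_row` and the float-or-str rule are hypothetical.

```python
def guess_dtypes_from_first_row(first_row):
    """Return float or str for each value in the first data row.

    Hypothetical helper for illustration only: a value that parses as a
    float marks its column as numeric, anything else marks it as text.
    """
    dtypes = []
    for token in first_row:
        try:
            float(token)
        except ValueError:
            dtypes.append(str)
        else:
            dtypes.append(float)
    return dtypes


first_row = ["100.0", "2.5", "SAND"]
print(guess_dtypes_from_first_row(first_row))
```

A sketch like this only looks at the first row, so a column whose first value is numeric but which contains text further down would still fail when converted to float.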

Consequences:


  • Coverage increased from 85% to 86%
  • Benchmark test time dropped from 1.4 s to 1.0 s (?)
---------- coverage: platform linux, python 3.6.13-final-0 -----------
Name                       Stmts   Miss  Cover
----------------------------------------------
lasio/__init__.py             13      2    85%
lasio/convert_version.py      20     20     0%
lasio/defaults.py             11      0   100%
lasio/examples.py             42     10    76%
lasio/excel.py                88     34    61%
lasio/exceptions.py            6      0   100%
lasio/las.py                 419     60    86%
lasio/las_items.py           190     29    85%
lasio/las_version.py          50     15    70%
lasio/reader.py              435     27    94%
lasio/writer.py              171      9    95%
----------------------------------------------
TOTAL                       1445    206    86%
Coverage XML written to file coverage.xml


----------------------------------------------- benchmark: 1 tests ----------------------------------------------
Name (time in s)                Min     Max    Mean  StdDev  Median     IQR  Outliers     OPS  Rounds  Iterations
-----------------------------------------------------------------------------------------------------------------
test_read_v12_sample_big     1.0652  1.0758  1.0700  0.0041  1.0694  0.0062       2;0  0.9346       5           1
-----------------------------------------------------------------------------------------------------------------

@kinverarity1 kinverarity1 added the data-section-parser A bug or enhancement relating to the data section parser label Apr 25, 2021
lasio/las.py (outdated, resolved)
@dcslagel dcslagel left a comment

This change looks good. I had one small change request.

@kinverarity1 kinverarity1 merged commit 5eb1854 into master Apr 26, 2021
@kinverarity1 kinverarity1 deleted the reshape-in-data-reader branch April 26, 2021 00:45
Successfully merging this pull request may close these issues.

Data section containing non-numeric chars is parsed entirely as str