anmyachev (Contributor) commented May 12, 2019

The performance gap between using `dayfirst` and using `format` is now acceptable.

asv run -E existing -b ParseDateComparison -a warmup_time=2 -a sample_time=2:

| Benchmark | cache_dates | time |
| --- | --- | --- |
| io.csv.ParseDateComparison.time_read_csv_dayfirst | False | 16.3±0.4ms |
| io.csv.ParseDateComparison.time_read_csv_dayfirst | True | 7.22±0.2ms |
| io.csv.ParseDateComparison.time_to_datetime_dayfirst | False | 15.0±0.3ms |
| io.csv.ParseDateComparison.time_to_datetime_dayfirst | True | 5.67±0.2ms |
| io.csv.ParseDateComparison.time_to_datetime_format_DD_MM_YYYY | False | 25.0±0.4ms |
| io.csv.ParseDateComparison.time_to_datetime_format_DD_MM_YYYY | True | 5.47±0.4ms |
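
For readers unfamiliar with the three code paths being timed, here is a minimal sketch of what each benchmark name appears to exercise. The sample data, row count, and variable names are assumptions made for illustration; this is not the benchmark code from the PR.

```python
import io

import pandas as pd

# Assumed sample data: one day-first date string repeated many times.
dates = pd.Series(["12-02-2010"] * 1000)

# Path 1: let to_datetime infer the layout, hinting that the day comes first.
by_dayfirst = pd.to_datetime(dates, dayfirst=True, cache=True)

# Path 2: state the layout explicitly with a strftime-style format.
by_format = pd.to_datetime(dates, format="%d-%m-%Y", cache=True)

# Path 3: parse the dates while reading CSV text.
csv_text = "\n".join(dates) + "\n"
frame = pd.read_csv(
    io.StringIO(csv_text),
    header=None,
    names=["Date"],
    parse_dates=["Date"],
    dayfirst=True,
    cache_dates=True,
)

# All three paths should agree that the value is 12 February 2010.
expected = pd.Timestamp(year=2010, month=2, day=12)
assert by_dayfirst.iloc[0] == expected
assert (by_dayfirst == by_format).all()
assert frame["Date"].iloc[0] == expected
```

Setting `cache=False` and `cache_dates=False` instead corresponds to the `False` rows in the table, where every string is parsed individually rather than once per unique value.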

jreback (Contributor) commented May 12, 2019

Great, can you show the results for these in the top section?

jreback added the IO CSV (read_csv, to_csv), Performance (memory or execution speed performance), and Datetime (datetime data dtype) labels May 12, 2019
jreback added this to the 0.25.0 milestone May 12, 2019
codecov bot commented May 12, 2019

Codecov Report

Merging #26360 into master will decrease coverage by 0.13%.
The diff coverage is n/a.


@@            Coverage Diff             @@
##           master   #26360      +/-   ##
==========================================
- Coverage   91.81%   91.68%   -0.14%     
==========================================
  Files         175      174       -1     
  Lines       52289    50700    -1589     
==========================================
- Hits        48009    46482    -1527     
+ Misses       4280     4218      -62

| Flag | Coverage Δ |
| --- | --- |
| #multiple | 90.18% <ø> (-0.18%) ⬇️ |
| #single | 41.19% <ø> (+0.32%) ⬆️ |

| Impacted Files | Coverage Δ |
| --- | --- |
| pandas/io/gbq.py | 78.94% <0%> (-10.53%) ⬇️ |
| pandas/core/frame.py | 97.01% <0%> (-0.12%) ⬇️ |
| pandas/io/parsers.py | |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4a30fa5...3b1034b. Read the comment docs.

codecov bot commented May 12, 2019

Codecov Report

Merging #26360 into master will decrease coverage by 0.12%.
The diff coverage is n/a.


@@            Coverage Diff             @@
##           master   #26360      +/-   ##
==========================================
- Coverage   91.81%   91.68%   -0.13%     
==========================================
  Files         175      174       -1     
  Lines       52289    50749    -1540     
==========================================
- Hits        48009    46530    -1479     
+ Misses       4280     4219      -61

| Flag | Coverage Δ |
| --- | --- |
| #multiple | 90.19% <ø> (-0.17%) ⬇️ |
| #single | 41.16% <ø> (+0.29%) ⬆️ |

| Impacted Files | Coverage Δ |
| --- | --- |
| pandas/io/gbq.py | 78.94% <0%> (-10.53%) ⬇️ |
| pandas/compat/numpy/function.py | 90.39% <0%> (-0.41%) ⬇️ |
| pandas/core/frame.py | 97.02% <0%> (-0.12%) ⬇️ |
| pandas/core/series.py | 93.67% <0%> (ø) ⬆️ |
| pandas/io/parsers.py | |
| pandas/core/indexes/base.py | 96.72% <0%> (ø) ⬆️ |
| pandas/core/arrays/integer.py | 96.35% <0%> (+0.02%) ⬆️ |
| pandas/core/sparse/frame.py | 95.63% <0%> (+0.14%) ⬆️ |
| pandas/core/arrays/sparse.py | 92.7% <0%> (+0.39%) ⬆️ |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4a30fa5...e33284d. Read the comment docs.

jreback (Contributor) commented May 14, 2019

@anmyachev

> @jreback Should I make the conditions for testing the same? (For this, I will most likely have to create new classes.)

Yes, making these consistent for the benchmarks (and testing with the cache both on and off) would be great.

anmyachev (Contributor, Author) commented

> Yes, making these consistent for the benchmarks (and testing with the cache both on and off) would be great.

@jreback this is done.
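
As an illustration of what "consistent conditions, with the cache both on and off" can look like, below is a rough sketch of an asv-style benchmark class parameterized on `cache_dates`. The class name `DayfirstVersusFormat`, the row count, and the `run_once` helper are hypothetical and are not the code merged in this PR; asv itself normally drives the `setup` and `time_*` methods.

```python
import io
import time

import pandas as pd


class DayfirstVersusFormat:
    """Hypothetical asv-style benchmark; not the class merged in this PR."""

    # asv runs every time_* method once per parameter value.
    params = ([False, True],)
    param_names = ["cache_dates"]

    def setup(self, cache_dates):
        # Every method sees the same input, so the timings are comparable.
        n_rows = 10_000  # assumed size
        self.csv_text = "12-02-2010\n" * n_rows
        self.strings = pd.Series(["12-02-2010"] * n_rows)

    def time_read_csv_dayfirst(self, cache_dates):
        pd.read_csv(
            io.StringIO(self.csv_text),
            header=None,
            names=["Date"],
            parse_dates=["Date"],
            dayfirst=True,
            cache_dates=cache_dates,
        )

    def time_to_datetime_dayfirst(self, cache_dates):
        pd.to_datetime(self.strings, dayfirst=True, cache=cache_dates)

    def time_to_datetime_format(self, cache_dates):
        pd.to_datetime(self.strings, format="%d-%m-%Y", cache=cache_dates)


def run_once(cache_dates):
    """Crude stand-in for asv: call each time_* method once and time it."""
    bench = DayfirstVersusFormat()
    bench.setup(cache_dates)
    timings = {}
    for name in sorted(dir(bench)):
        if name.startswith("time_"):
            start = time.perf_counter()
            getattr(bench, name)(cache_dates)
            timings[name] = time.perf_counter() - start
    return timings


if __name__ == "__main__":
    for flag in (False, True):
        print(flag, run_once(flag))
```

A single timed call is noisy; asv repeats each method many times and reports a spread, which is why the figures earlier in the thread carry a ± value.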

jreback merged commit e5d15b2 into pandas-dev:master May 14, 2019

jreback (Contributor) commented May 14, 2019

thanks @anmyachev

Labels: Datetime (datetime data dtype), IO CSV (read_csv, to_csv), Performance (memory or execution speed performance)

Successfully merging this pull request may close these issues:

read_csv : using day first 23x to 35x slower than setting the format explicitly

2 participants