ENH: context-manager for chunksize/iterator-reader #38225

twoertwein · 2020-12-02T02:54:26Z

Allows (and encourages) the following use:

import pandas as pd

filename = "pandas/tests/io/data/csv/iris.csv"
chunksize = 2
with pd.read_csv(filename, chunksize=chunksize) as reader:
    for chunk in reader:
        # risky code that might raise
        print(chunk.shape)  # placeholder so the loop body is valid Python
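
To illustrate why the with-form helps when the loop body raises, here is a minimal sketch of the context-manager protocol on a chunked reader. The `ChunkReader` class, its attributes, and the temporary file are hypothetical and exist only for this illustration; this is not pandas' actual `TextFileReader` implementation.

```python
import csv
import os
import tempfile


class ChunkReader:
    """Hypothetical chunked CSV reader (illustration only, not pandas code)."""

    def __init__(self, path, chunksize):
        self.handle = open(path, newline="")
        self.rows = csv.reader(self.handle)
        self.chunksize = chunksize

    def __iter__(self):
        return self

    def __next__(self):
        # Collect up to `chunksize` rows; stop once the file is exhausted.
        chunk = []
        for row in self.rows:
            chunk.append(row)
            if len(chunk) == self.chunksize:
                break
        if not chunk:
            raise StopIteration
        return chunk

    def close(self):
        self.handle.close()

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Runs on normal exit and also when the with-body raises.
        self.close()


# Build a small CSV file so the sketch is self-contained.
fd, path = tempfile.mkstemp(suffix=".csv")
with os.fdopen(fd, "w", newline="") as f:
    csv.writer(f).writerows([[i, i * i] for i in range(5)])

seen = []
try:
    with ChunkReader(path, chunksize=2) as reader:
        for chunk in reader:
            seen.append(chunk)
            raise RuntimeError("risky code failed")
except RuntimeError:
    pass

# The file handle was closed even though the loop body raised.
handle_closed = reader.handle.closed
os.remove(path)
```

Without the with-statement, the caller would need an explicit try/finally around the loop to get the same guarantee.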

The same can be done for read_json/read_sas (I think these are all the readers that support chunksize). If this PR should make it into 1.2, I can quickly add the changes for json/sas as well.

Are there more places to promote this new context manager?

jreback · 2020-12-02T03:09:21Z

this looks pretty neat. can you add a small note section in io.rst that shows this off? i think this is ok for 1.2; json/sas can be added as a follow-up.

twoertwein · 2020-12-02T15:11:15Z

I don't understand why the documentation example is failing:

Exception in /home/runner/work/pandas/pandas/doc/source/user_guide/io.rst at block ending on line 1586
ParserError: Error tokenizing data. C error: out of memory

edit: the blank line inside the with-block was causing the issue; it was interpreted as the end of the with-block
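
The plain Python interactive console treats a blank line the same way, which can be checked with the standard library's `codeop` module. This is only an analogy for how the documentation's code blocks were parsed, not the Sphinx/IPython directive's actual code; the file name `data.csv` is a placeholder and is never opened, because the source is compiled but not executed.

```python
import codeop

with_block = "with open('data.csv') as reader:\n    first = next(reader)"

# Without a trailing blank line the compound statement is still "open":
# compile_command returns None to signal that more input is expected.
still_open = codeop.compile_command(with_block)

# A blank line ends the with-block, so the statement now compiles as complete.
closed_block = codeop.compile_command(with_block + "\n")

block_is_incomplete = still_open is None
blank_line_ends_block = closed_block is not None
```

Any indented line that follows such a blank line is therefore no longer part of the with-block.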

doc/source/user_guide/io.rst

jreback

great. just some minor doc comments. ping on green.

doc/source/user_guide/io.rst

doc/source/whatsnew/v1.2.0.rst

pandas/io/parsers.py

pandas/io/sas/sasreader.py

twoertwein · 2020-12-04T18:28:24Z

@jreback green. I hope the whatsnew entry is good now

jreback · 2020-12-04T18:58:53Z

thanks @twoertwein very nice

jreback added the IO CSV read_csv, to_csv label Dec 2, 2020

twoertwein force-pushed the contextmanagers branch 4 times, most recently from b820965 to 4a59498, December 2, 2020 06:22

twoertwein changed the title from "ENH: context-manager for TextFileReader" to "ENH: context-manager for TextFile/JSON/SASReader" Dec 2, 2020

twoertwein marked this pull request as ready for review December 2, 2020 07:37

twoertwein force-pushed the contextmanagers branch 3 times, most recently from 0871fb8 to a0fb4f7, December 3, 2020 18:18

twoertwein changed the title from "ENH: context-manager for TextFile/JSON/SASReader" to "ENH: context-manager for chunksize/iterator-reader" Dec 3, 2020

jreback requested changes Dec 3, 2020

doc/source/user_guide/io.rst

twoertwein force-pushed the contextmanagers branch 4 times, most recently from 7af38d3 to 0b10951, December 4, 2020 14:52

jreback requested changes Dec 4, 2020

doc/source/user_guide/io.rst

doc/source/user_guide/io.rst

doc/source/whatsnew/v1.2.0.rst (outdated)

pandas/io/parsers.py (outdated)

pandas/io/sas/sasreader.py

jreback added this to the 1.2 milestone Dec 4, 2020

twoertwein added 2 commits December 4, 2020 12:03

ENH: context-manager for chunksize/iterator-reader

bfc8e96

new lines

0c61786

twoertwein force-pushed the contextmanagers branch from 0b10951 to 0c61786, December 4, 2020 17:11

jreback approved these changes Dec 4, 2020

jreback merged commit 5011a37 into pandas-dev:master Dec 4, 2020

twoertwein deleted the contextmanagers branch December 4, 2020 20:40

antonymilne mentioned this pull request Jun 1, 2021

[KED-2639] Cannot read csv in chunks with pandas kedro-org/kedro#598

Closed

gwaybio mentioned this pull request Jul 9, 2021

add a configurable chunksize whenever loading broadinstitute/pooled-cell-painting-profiling-recipe#78

Merged
