Description
Good morning,
Using Python 3.6.
My problem seems close to issue #14734, though with a different error type. Please forgive my lack of expertise, but I cannot tell (1) whether my issue is really similar, or (2) whether the issue affecting some SAS files has been solved or whether there may still be problems with some SAS files (which I cannot provide, for the reasons detailed just below).
I have read the posting rules, but I cannot attach a sample of my data or reproduce the entire error message, as the data I am working on is located on a server without internet access. I apologize for the inconvenience. I will nevertheless try to reproduce most of what is requested below.
I am working with very large SAS files (data on each job, hence millions of rows) and got a memory error when simply trying to read them (strangely, they open fine in R or Stata). I therefore searched and found the chunksize option of pandas.read_sas, which reads the data in chunks. My code is now the following:
import pandas as pd
chunk_list = []  # collects each chunk as it is read
df_chunk = pd.read_sas(r'file.sas7bdat', chunksize=500)
for chunk in df_chunk:
    chunk_list.append(chunk)

At this point I get the following error (I am reproducing it here manually, as I cannot copy and paste):
line 660, in _chunk_to_dataframe
    if self.column_formats[j] in const.sas_date_formats:
IndexError: list index out of range

I am aware that my description of the issue is truncated and probably incomplete, but many thanks for any help you could provide,
Axelle
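
Since the data cannot be shared, one way to narrow the problem down might be to count how many chunks are read before the IndexError appears. The following is only a minimal diagnostic sketch, not a fix: `find_failing_chunk` and `scan_sas_file` are hypothetical helper names introduced for illustration, and the only pandas call used is the `pd.read_sas(..., chunksize=...)` call from the code above.

```python
from typing import Any, Iterable, Optional, Tuple


def find_failing_chunk(chunks: Iterable[Any]) -> Tuple[int, Optional[Exception]]:
    """Consume `chunks` and return (chunks_read, error).

    `error` is None if every chunk was read. Otherwise it is the IndexError
    raised while fetching the chunk at 0-based position `chunks_read`.
    Each chunk is discarded once counted, so nothing accumulates in memory.
    """
    iterator = iter(chunks)
    chunks_read = 0
    while True:
        try:
            next(iterator)
        except StopIteration:
            # The reader was exhausted without any error.
            return chunks_read, None
        except IndexError as exc:
            # Same error type as in the report; stop and say where it happened.
            return chunks_read, exc
        chunks_read += 1


def scan_sas_file(path: str, chunksize: int = 500) -> Tuple[int, Optional[Exception]]:
    """Hypothetical wrapper: run find_failing_chunk over the chunks of a SAS file."""
    # Imported here so that find_failing_chunk itself has no pandas dependency.
    import pandas as pd

    reader = pd.read_sas(path, chunksize=chunksize)
    return find_failing_chunk(reader)
```

Calling `scan_sas_file(r'file.sas7bdat', chunksize=500)` returns a pair `(n, error)`. If `error` is not None, the failure happened while reading the rows at 0-based positions 500*n to 500*n + 499, which could help check whether one specific part of the file triggers the error or whether it fails on the very first chunk.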