Deleted Records. #38
Comments
Thanks for reporting the issue, @PCaff. I don't think parso supports deleted records. Do you think the error you are facing is related to this? If you have some test examples, it would be great if you could share them, and we can even include them in our test suite.
I'm trying to replicate the issue with the test data provided in this repo, @printsev. I'll hopefully get a commit/PR in within the next hour or so tonight. Currently, I am able to produce a deleted record in the test data. This doesn't replicate my issue, but I am convinced it is related. It seems parso ignores a page that contains deleted records. Using all_rand_normal.sas7bdat with a deleted record, I receive 0 columns and 0 rows.
Actually, to make it easier, I'll just post the file here:
It also looks like this issue may encompass records split between pages. If someone can point me to the right place in the logic, I can try to enhance it.
Will be running tests this week on new logic related to this issue. New pages were found that contain deleted records. The logic for mapping the deleted markers to the records is included.
Merged to master so closing the issue.
I'm one of the users who has experienced issues while using this in Spark.
I continually get ArrayIndexOutOfBounds errors, which is similar to the other issues that users have reported while using the Spark version. I recently cloned this repo and wrote a quick program to read one of the failing files using just parso. The error persisted (though with a different stack trace).
The issue (I believe) is that null records are placed into the data returned by the readAll() method, so cycling through the 2D array without proper null checking throws a NullPointerException.
An interesting observation I made while counting the records I was able to read before one of these errors: the number of null rows is very close to the number of deleted rows.
Does parso handle deleted rows? If not, is there any logic planned for deleted records? This error can be worked around by resaving the SAS file in a SAS program. However, for large files this takes a very long time.
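For anyone hitting the same NullPointerException, a caller-side stopgap is to skip the null rows before iterating over the array returned by readAll(). The following is only a minimal sketch, not parso's fix: the class and method names (NullRowFilter, dropNullRows, countNullRows) are hypothetical, and the sample array in main is a stand-in for real readAll() output. It assumes the null entries are whole rows, as described above, and it does not recover any data or guarantee that the surviving rows exclude deleted records.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of parso: guards iteration over a
// row-major Object[][] (such as the one readAll() returned in this report)
// that may contain null rows.
class NullRowFilter {

    // Returns only the non-null rows, preserving their original order.
    static Object[][] dropNullRows(Object[][] rows) {
        List<Object[]> kept = new ArrayList<>();
        for (Object[] row : rows) {
            if (row != null) {
                kept.add(row);
            }
        }
        return kept.toArray(new Object[0][]);
    }

    // Counts null rows, useful for comparing against the number of
    // deleted records reported by SAS.
    static int countNullRows(Object[][] rows) {
        int nulls = 0;
        for (Object[] row : rows) {
            if (row == null) {
                nulls++;
            }
        }
        return nulls;
    }

    public static void main(String[] args) {
        // Stand-in data (assumption): two real rows interleaved with nulls.
        Object[][] rows = { {1L, "a"}, null, {2L, "b"}, null };

        System.out.println("null rows: " + countNullRows(rows));
        for (Object[] row : dropNullRows(rows)) {
            System.out.println(row[0] + ", " + row[1]);
        }
    }
}
```

Comparing the countNullRows result with the deleted-record count from SAS is a quick way to check whether the two numbers really line up for a given file.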