Getting Parse Exception while reading CWR file #175

Open
avanishp2 opened this issue Jun 5, 2017 · 5 comments

@avanishp2

Hi,

I am reading a CWR file through this Data-api library and I am getting the following exception:

  File "C:\Python34\lib\site-packages\pyparsing.py", line 2794, in parseImpl
    raise ParseException(instring, loc, self.errmsg, self)
pyparsing.ParseException: Expected sd_type (at char 114), (line:2, col:27)

I have tested on Python 2.7, Python 3.4 and Python 3.6 and get the same exception on each. Any help would be appreciated.

Here's the full stack trace of the program.

D:\python_web_crawler>python cwr-convertor.py
File to JSON test
Please enter the full path to a CWR file (e.g. c:/documents/file.cwr): D:/MusicWorksDB/CW160035UN_DIG.V21
Please enter the full path to the file where the results will be stored: D:/MusicWorksDB

Reading file D:/MusicWorksDB/CW160035UN_DIG.V21
Storing output on D:/MusicWorksDB

Traceback (most recent call last):
  File "cwr-convertor.py", line 24, in <module>
    data = decoder.decode(data)
  File "C:\Python34\lib\site-packages\cwr\parser\decoder\file.py", line 305, in decode
    transmission = self._file_decoder.decode(data['contents'])[0]
  File "C:\Python34\lib\site-packages\cwr\parser\decoder\common.py", line 90, in decode
    return self._grammar.parseString(text)
  File "C:\Python34\lib\site-packages\pyparsing.py", line 1632, in parseString
    raise exc
  File "C:\Python34\lib\site-packages\pyparsing.py", line 1622, in parseString
    loc, tokens = self._parse( instring, 0 )
  File "C:\Python34\lib\site-packages\pyparsing.py", line 1379, in _parseNoCache
    loc,tokens = self.parseImpl( instring, preloc, doActions )
  File "C:\Python34\lib\site-packages\pyparsing.py", line 3395, in parseImpl
    loc, exprtokens = e._parse( instring, loc, doActions )
  File "C:\Python34\lib\site-packages\pyparsing.py", line 1379, in _parseNoCache
    loc,tokens = self.parseImpl( instring, preloc, doActions )
  File "C:\Python34\lib\site-packages\pyparsing.py", line 3378, in parseImpl
    loc, resultlist = self.exprs[0]._parse( instring, loc, doActions, callPreParse=False )
  File "C:\Python34\lib\site-packages\pyparsing.py", line 1379, in _parseNoCache
    loc,tokens = self.parseImpl( instring, preloc, doActions )
  File "C:\Python34\lib\site-packages\pyparsing.py", line 3378, in parseImpl
    loc, resultlist = self.exprs[0]._parse( instring, loc, doActions, callPreParse=False )
  File "C:\Python34\lib\site-packages\pyparsing.py", line 1379, in _parseNoCache
    loc,tokens = self.parseImpl( instring, preloc, doActions )
  File "C:\Python34\lib\site-packages\pyparsing.py", line 3378, in parseImpl
    loc, resultlist = self.exprs[0]._parse( instring, loc, doActions, callPreParse=False )
  File "C:\Python34\lib\site-packages\pyparsing.py", line 1379, in _parseNoCache
    loc,tokens = self.parseImpl( instring, preloc, doActions )
  File "C:\Python34\lib\site-packages\pyparsing.py", line 3395, in parseImpl
    loc, exprtokens = e._parse( instring, loc, doActions )
  File "C:\Python34\lib\site-packages\pyparsing.py", line 1379, in _parseNoCache
    loc,tokens = self.parseImpl( instring, preloc, doActions )
  File "C:\Python34\lib\site-packages\pyparsing.py", line 3545, in parseImpl
    raise maxException
  File "C:\Python34\lib\site-packages\pyparsing.py", line 3530, in parseImpl
    ret = e._parse( instring, loc, doActions )
  File "C:\Python34\lib\site-packages\pyparsing.py", line 1383, in _parseNoCache
    loc,tokens = self.parseImpl( instring, preloc, doActions )
  File "C:\Python34\lib\site-packages\pyparsing.py", line 2794, in parseImpl
    raise ParseException(instring, loc, self.errmsg, self)
pyparsing.ParseException: Expected sd_type (at char 114), (line:2, col:27)
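
To see exactly where the parser stopped, it may help to print the row named in the exception with a marker under the reported column. The following is a minimal sketch using only the standard library. `show_position` is a hypothetical helper, not part of the library, and the sample rows are made up rather than taken from the file above. It assumes the line and column numbers in the message are 1-based, as pyparsing reports them.

```python
def show_position(text, line, col):
    """Return the reported row with a caret under the reported column.

    `line` and `col` are 1-based, matching "(line:2, col:27)" in the message.
    """
    rows = text.splitlines()
    row = rows[line - 1]
    return row + "\n" + " " * (col - 1) + "^"


# Hypothetical two-row sample; the second row is 26 characters long.
sample = "HDR...\n" + "GRH" + "AGR" + "00001" + "02.10" + "0" * 10 + "\n"
print(show_position(sample, 2, 27))
```

With a real file, the text would come from reading the CWR file into a string first. A caret that lands just past the end of the row suggests the parser expected one more field there.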

@Bernardo-MG
Collaborator

Bernardo-MG commented Jun 5, 2017 via email

@Bernardo-MG Bernardo-MG self-assigned this Jun 5, 2017
@Bernardo-MG Bernardo-MG added the bug label Jun 5, 2017
@Bernardo-MG
Collaborator

Sorry I couldn't take a look sooner.

From what I can gather, a line in the file is missing the SD Type, a field of two alphanumeric characters at the end of a group header (a GRH row).

Could you please verify that?

The parser is very strict, so something like that can break the parsing.
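
One way to verify this without the parser is to scan the file for GRH rows and look at the last two characters of the fixed-width layout. The sketch below assumes the GRH field widths are 3, 3, 5, 5, 10 and 2 characters (record type, transaction type, group ID, version, batch request, SD Type). That is an assumption to check against the CWR functional document, although it would agree with the reported col:27, since SD Type would then start at column 27. The function name and the sample rows (including the "AV" value) are hypothetical.

```python
# Assumed GRH layout (field widths in characters); verify against the
# CWR functional document before relying on it.
GRH_FIELDS = [
    ("record_type", 3),
    ("transaction_type", 3),
    ("group_id", 5),
    ("version", 5),
    ("batch_request", 10),
    ("sd_type", 2),
]
SD_TYPE_START = sum(width for _, width in GRH_FIELDS[:-1])  # 0-based offset 26
SD_TYPE_END = SD_TYPE_START + GRH_FIELDS[-1][1]


def grh_rows_without_sd_type(text):
    """Return 1-based line numbers of GRH rows whose SD Type is missing,
    blank, or shorter than two characters."""
    missing = []
    for number, row in enumerate(text.splitlines(), start=1):
        if not row.startswith("GRH"):
            continue
        sd_type = row[SD_TYPE_START:SD_TYPE_END]
        if len(sd_type.strip()) < 2:
            missing.append(number)
    return missing


# Hypothetical sample: row 2 has no SD Type, row 3 ends with "AV".
sample = "\n".join([
    "HDR...",
    "GRHAGR0000102.10" + "0" * 10,
    "GRHNWR0000202.10" + "0" * 10 + "AV",
])
print(grh_rows_without_sd_type(sample))
```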

@avanishp2
Author

I have tried different CWR files as input, including the one you included in the tests/example folder, and I still get the same error.
There is a note in the CWR functional document that states:
"Submission / Distribution Type is used only in the case of audio-visual transactions. This field will be ignored for CWR transactions."

SD_Type is a non-mandatory field according to the CWR functional document.

@Bernardo-MG
Collaborator

I've set the SD type as optional and uploaded a new version to PyPI with the latest changes. Could you try it now?

Sorry it is taking so long, but I do this in my spare time.
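
For illustration only, here is roughly what treating the SD type as optional means for a single GRH row. This is not the library's grammar, which is built with pyparsing. It is a regular-expression stand-in, and the field widths, the character classes and the `parse_grh` name are all assumptions.

```python
import re

# Regex stand-in for one GRH row (the library itself uses pyparsing).
# Field widths and character classes are assumptions; SD Type is optional.
GRH_ROW = re.compile(
    r"GRH"
    r"(?P<transaction_type>[A-Z]{3})"
    r"(?P<group_id>\d{5})"
    r"(?P<version>\d{2}\.\d{2})"
    r"(?P<batch_request>[\d ]{10})"
    r"(?P<sd_type>[A-Za-z0-9]{2})?"
    r"\s*"
)


def parse_grh(row):
    """Parse one GRH row into a dict; sd_type is None when the field is absent."""
    match = GRH_ROW.fullmatch(row)
    if match is None:
        raise ValueError("not a GRH row: %r" % row)
    return match.groupdict()


# Hypothetical rows, with and without the trailing SD Type.
short_row = "GRHAGR0000102.10" + "0" * 10
print(parse_grh(short_row)["sd_type"])
print(parse_grh(short_row + "AV")["sd_type"])
```

A row that stops after the batch request, or that pads the last two positions with spaces, parses with `sd_type` set to `None` instead of raising an error.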

@Bernardo-MG
Collaborator

After taking a closer look, I don't think this problem will be solved in the short term. There are some problems with the grammar used by the parser and with acknowledgement files, which are related to this issue.
