maguec commented Mar 18, 2019

Handle the case where quotes around an entry cause the following error:

Traceback (most recent call last):
  File "bulk_insert.py", line 399, in <module>
    bulk_insert()
  File "/home/chris/.local/lib/python3.6/site-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/home/chris/.local/lib/python3.6/site-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/home/chris/.local/lib/python3.6/site-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/chris/.local/lib/python3.6/site-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "bulk_insert.py", line 387, in bulk_insert
    process_entity_csvs(Label, nodes)
  File "bulk_insert.py", line 307, in process_entity_csvs
    entity = cls(in_csv)
  File "bulk_insert.py", line 169, in __init__
    self.process_entities(expected_col_count)
  File "bulk_insert.py", line 192, in process_entities
    self.validate_row(expected_col_count, row)
  File "bulk_insert.py", line 133, in validate_row
    % (self.infile.name, self.reader.line_num, expected_col_count, len(row), ','.join(row)))
__main__.CSVError: Movies.csv:49 Expected 2 columns, encountered 3 ('tt0000049,"Boxing Match; or,Glove Contest"')
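
The column-count mismatch in this traceback can be reproduced with the standard library alone. The following is a minimal sketch, not the loader's actual code: it parses the offending row from the error message twice, once with `csv.QUOTE_NONE` and once with `csv.QUOTE_MINIMAL`. The variable names are illustrative.

```python
import csv
import io

# The row reported at Movies.csv:49 in the traceback.
line = 'tt0000049,"Boxing Match; or,Glove Contest"\n'

# QUOTE_NONE: quote characters are treated as ordinary data, so the comma
# inside the quoted title is read as a field separator.
raw = next(csv.reader(io.StringIO(line), skipinitialspace=True,
                      quoting=csv.QUOTE_NONE))

# QUOTE_MINIMAL (the csv module's default): the quoted title is kept as one
# field, and the surrounding quote characters are stripped.
minimal = next(csv.reader(io.StringIO(line), skipinitialspace=True,
                          quoting=csv.QUOTE_MINIMAL))

print(len(raw), raw)
print(len(minimal), minimal)
```

With `QUOTE_NONE` the row splits into three fields, matching the "Expected 2 columns, encountered 3" message. With `QUOTE_MINIMAL` it yields the expected two, at the cost of removing the quote characters from the value.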

jeffreylovitz self-requested a review March 18, 2019 20:51
bulk_insert.py Outdated
# Initialize CSV reader that ignores leading whitespace in each field
# and does not modify input quote characters
self.reader = csv.reader(self.infile, skipinitialspace=True, quoting=csv.QUOTE_NONE)
#self.reader = csv.reader(self.infile, skipinitialspace=True)

Delete commented line

bulk_insert.py Outdated
@click.option('--max-token-count', '-c', default=1024, help='max number of processed CSVs to send per query (default 1024)')
@click.option('--max-buffer-size', '-b', default=2048, help='max buffer size in megabytes (default 2048)')
@click.option('--max-token-size', '-t', default=500, help='max size of each token in megabytes (default 500, max 512)')
@click.option('--max-token-size', '-t', default=500, help='max size of each token in megabytes (default 500, max 512)')

Duplicate option line

@click.option('--max-buffer-size', '-b', default=2048, help='max buffer size in megabytes (default 2048)')
@click.option('--max-token-size', '-t', default=500, help='max size of each token in megabytes (default 500, max 512)')
@click.option('--max-token-size', '-t', default=500, help='max size of each token in megabytes (default 500, max 512)')
@click.option('--quote-minimal/--no-quote-minimal', '-q/-d', default=False, help='only quote those fields which contain special characters such as delimiter, quotechar or any of the characters in lineterminator')

I don't think the -d alternate is strictly necessary, but if you think it is helpful then I have no objection!

Contributor Author

Just kind of helpful for me, as I'm lazy.
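
As a rough illustration of how the boolean produced by `--quote-minimal/--no-quote-minimal` could select the reader's quoting mode, here is a sketch using only the `csv` module. The `make_reader` helper and its `quote_minimal` parameter are hypothetical names for this example, and the mapping shown is an assumption rather than the merged implementation.

```python
import csv
import io


def make_reader(infile, quote_minimal=False):
    """Hypothetical helper: choose the csv quoting mode from the flag value."""
    # Assumption: the flag being off keeps the existing QUOTE_NONE behaviour,
    # which leaves quote characters in the data untouched.
    quoting = csv.QUOTE_MINIMAL if quote_minimal else csv.QUOTE_NONE
    return csv.reader(infile, skipinitialspace=True, quoting=quoting)


sample = 'tt0000049,"Boxing Match; or,Glove Contest"\n'

# Flag off (the option's default of False): the quoted comma splits the title.
default_row = next(make_reader(io.StringIO(sample)))

# Flag on (-q): the quoted title is read as a single field.
quoted_row = next(make_reader(io.StringIO(sample), quote_minimal=True))

print(len(default_row), len(quoted_row))
```

Keeping `False` as the default means existing inputs that rely on literal quote characters are parsed as before, and quoted fields are only interpreted when the user opts in.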

jeffreylovitz left a comment

Great addition, thanks @maguec!

Some minor notes in comments, but this addition works very well for me.

maguec commented Mar 20, 2019

@jeffreylovitz Removed the duplicate line and the commented-out line. Please review.

jeffreylovitz merged commit d88c20e into RedisGraph:master Mar 20, 2019
@jeffreylovitz

Thanks, @maguec!
