Non-ascii field names #1008

Closed
Vanuan opened this issue Jan 12, 2015 · 2 comments · Fixed by #3696

Comments

@Vanuan

Vanuan commented Jan 12, 2015

In Python 2, identifiers are limited to the ASCII character set.
Scrapy's CSV exporter uses the Item's field names as the CSV header row.
I want to use Unicode characters in the CSV header row.
Can I do that without writing my own exporter?

@ghost

ghost commented Jan 18, 2015

I think the easiest way (I often see this as the easiest way, so I may be wrong) is to extend the existing class like this (untested, for reference):

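from scrapy.exporters import CsvItemExporter  # scrapy.contrib.exporter before Scrapy 1.0


class UnicodeCsvItemExporter(CsvItemExporter):
    def __init__(self, file, include_headers_line=True, join_multivalued=',', headers=None, **kwargs):
        super(UnicodeCsvItemExporter, self).__init__(file, include_headers_line, join_multivalued, **kwargs)
        # Maps field names to the titles written in the header row.
        # A None default avoids sharing one mutable dict between instances.
        self.header_mapping = headers or {}

    def _write_headers_and_set_fields_to_export(self, item):
        if self.include_headers_line:
            if not self.fields_to_export:
                self.fields_to_export = list(item.fields.keys())
            # Build a real list for writerow, falling back to the field name
            # when no title is mapped. On Python 2, encode unicode titles
            # (e.g. to UTF-8) before writing.
            self.csv_writer.writerow([self.header_mapping.get(x, x) for x in self.fields_to_export])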

Then, you'd have to add your exporter to the settings:

FEED_FORMAT = 'csv_unicode'
FEED_EXPORTERS = {
    'csv_unicode': '.....',
}

Edit:
Hmm, or maybe not. You cannot pass the headers parameter through the string-based settings. If you find a way around that, this will work; otherwise, it's not a solution.
My idea would be to add another setting in settings.py called CSV_UNICODE_HEADERS, but exporters seem to be one of the few places that have no reference to the settings.

I am now really wondering where Scrapy creates the instance of the selected exporter, because each class has a totally different constructor signature.
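
One possible way around the missing headers argument, sketched here without Scrapy so that it runs on its own: keep the mapping as a class attribute, so the class can be referenced by its dotted path and constructed with no extra parameters. `MappedHeaderCsvWriter`, `HEADER_MAPPING`, `export_to_string`, and the field names are all hypothetical. This is not Scrapy's exporter API, and a real exporter would subclass `CsvItemExporter` instead.

```python
import csv
import io


class MappedHeaderCsvWriter:
    """Stand-in for an exporter: writes dict items as CSV rows.

    Hypothetical class, not part of Scrapy. The header titles live in a
    class attribute, so no extra constructor argument is needed.
    """

    # Field name -> title for the header row (assumed example data).
    HEADER_MAPPING = {"name": "Имя", "price": "Цена"}

    def __init__(self, file, fields_to_export):
        self.fields_to_export = list(fields_to_export)
        self.csv_writer = csv.writer(file, lineterminator="\n")
        self._headers_written = False

    def export_item(self, item):
        if not self._headers_written:
            # Fall back to the raw field name when no title is mapped.
            titles = [self.HEADER_MAPPING.get(f, f) for f in self.fields_to_export]
            self.csv_writer.writerow(titles)
            self._headers_written = True
        self.csv_writer.writerow([item.get(f, "") for f in self.fields_to_export])


def export_to_string(items, fields):
    """Export items to an in-memory buffer and return the CSV text."""
    buffer = io.StringIO()
    writer = MappedHeaderCsvWriter(buffer, fields)
    for item in items:
        writer.export_item(item)
    return buffer.getvalue()
```

The trade-off is that each distinct set of titles needs its own subclass, which may be acceptable when a project has only a few item types.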

@nramirezuy
Contributor

@Vanuan Extend this method to pull the header name from the item.Field.
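
A minimal sketch of that suggestion, assuming the header text is stored under a hypothetical `title` key in each field's metadata. Scrapy's `Field` is a `dict` subclass and `Item.fields` maps field names to `Field` objects, so plain dicts stand in for them here to keep the example runnable without Scrapy. `header_titles` and the field names are illustrative only.

```python
def header_titles(fields, fields_to_export=None, key="title"):
    """Return header titles for the given fields.

    `fields` maps field names to metadata dicts, like `Item.fields`.
    A field without the metadata key keeps its own name as the title.
    """
    names = list(fields_to_export) if fields_to_export else list(fields)
    return [fields[name].get(key, name) for name in names]


# Stand-in for Item.fields; with Scrapy this would come from declarations
# such as `name = scrapy.Field(title="Имя")` (the `title` key is an assumption).
product_fields = {
    "name": {"title": "Имя"},
    "price": {"title": "Цена"},
    "url": {},
}
```

Inside an exporter subclass, the overridden `_write_headers_and_set_fields_to_export` could pass `item.fields` to a helper like this and hand the result to `self.csv_writer.writerow(...)`. The titles then live next to the field declarations, so no extra constructor argument or setting is needed.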
