This repository has been archived by the owner on Nov 22, 2022. It is now read-only.

Drop rows with insufficient columns in TSV data source #954

Closed
wants to merge 1 commit into from

Conversation

hikushalhere
Contributor

Summary: We use csv.DictReader when reading TSV files. When it reads a row with too few columns, it sets the missing trailing columns to None. This diff introduces a flag to ignore such rows. If they are not ignored, a tensorizer could throw KeyError in its numberize() method, since None is not a valid key.

Differential Revision: D17223777
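
The behavior described in the summary can be reproduced with the standard library alone. The following is a minimal sketch, not the code from this diff: the `read_tsv` helper, the `drop_incomplete_rows` flag name, and the toy `label_vocab` lookup are all hypothetical names chosen for illustration, and the actual flag added by the PR may be named and wired differently.

```python
import csv
import io


def read_tsv(lines, fieldnames, drop_incomplete_rows=False):
    """Yield one dict per TSV row (hypothetical helper, not the PR's code).

    When drop_incomplete_rows is True, rows with fewer columns than
    fieldnames are skipped instead of being yielded with None values.
    """
    reader = csv.DictReader(lines, fieldnames=fieldnames, delimiter="\t")
    for row in reader:
        # csv.DictReader fills columns missing from a short row with
        # restval, which defaults to None.
        if drop_incomplete_rows and any(row[name] is None for name in fieldnames):
            continue
        yield row


fields = ["text", "label"]
# The second row has only one column, so its "label" is missing.
data = "hello world\tgreet\nonly text\n"

all_rows = list(read_tsv(io.StringIO(data), fields))
kept_rows = list(read_tsv(io.StringIO(data), fields, drop_incomplete_rows=True))

# Toy stand-in for a tensorizer vocabulary lookup (assumption for illustration).
label_vocab = {"greet": 0}
try:
    label_vocab[all_rows[1]["label"]]
    lookup_failed = False
except KeyError:
    # None was never added to the vocabulary, so the lookup fails.
    lookup_failed = True
```

With the flag off, the short row reaches the lookup with `label` set to `None` and the lookup raises `KeyError`. With the flag on, only the complete row is yielded.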

@facebook-github-bot added the CLA Signed label Sep 6, 2019
Summary:
Pull Request resolved: #954

We use `csv.DictReader` when reading TSV files. When it reads a row with too few columns, it sets the missing trailing columns to `None`. This diff introduces a flag to ignore such rows. If they are not ignored, a tensorizer could throw `KeyError` in its `numberize()` method, since `None` is not a valid key.

Reviewed By: borguz

Differential Revision: D17223777

fbshipit-source-id: de21f5fe8ed4f2e06032f82f522e6539d6c48c3f
@facebook-github-bot
Contributor

This pull request has been merged in 78dce2a.

Labels
CLA Signed, Merged

2 participants