Temporary CSV files generated have control M characters #175

@OmkarPathak

We are trying to load data from Oracle into BigQuery. However, the temporary CSV files that are generated contain Control-M (carriage return, ^M) characters, which are interpreted as newlines. As a result, BigQuery cannot process these CSV files. Are there any filters to strip such characters? Any help would be appreciated.
Note: The tables in question are 15+ GB in size.
Sample config.yml:

in:
  type: oracle
  driver_path: /ojdbc7-12.1.0.2.jar
  url: jdbc:oracle:thin:@something.com:1526/DB
  user: user
  password: "password"
  query: "SELECT * FROM SCHEMA.TABLE"
  fetch_rows: 4000
  connect_timeout: 100
  formatter:
    type: csv
    delimiter: ","
    newline: CR
    newline_in_field: CR
    escape: "\\"
    null_string: "\\N"
out:
   type: bigquery
   mode: replace
   auth_method: service_account
   project: project
   dataset: "dataset"
   table: "table"
   location: europe-west2
   json_keyfile: credentials.json
   allow_quoted_newlines: true
   abort_on_error: false
   delete_from_local_when_job_end: false
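
A possible workaround, sketched under assumptions and not verified against this setup, is to strip the carriage returns in the source query itself so they never reach the temporary CSV files. The fragment below uses Oracle's REPLACE and CHR functions (CHR(13) is the carriage return). ID and COMMENTS are hypothetical column names used only for illustration; each affected text column would need to be listed explicitly instead of using SELECT *.

```yaml
in:
  type: oracle
  # ID and COMMENTS are hypothetical column names, not taken from the real table.
  # REPLACE(..., CHR(13), '') removes carriage return (Control-M) characters
  # from the column value before the row is written out.
  query: "SELECT ID, REPLACE(COMMENTS, CHR(13), '') AS COMMENTS FROM SCHEMA.TABLE"
```

Doing the cleanup in the query avoids post-processing 15+ GB of CSV on disk, at the cost of spelling out every column that may contain carriage returns.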
