Open
Labels
question (Further information is requested)
Description
Background
I am currently working on a project where a COBOL-based system uses an MS SQL Server instance as its back end.
I can connect to the SQL Server database over JDBC, which loads the table into a Spark DataFrame, but the data is still EBCDIC-encoded. That is a problem because I am using AWS Glue and want to write the data to Parquet files for downstream processes. I can also parse the copybook with your copybook parser.
However, the two structures (the DataFrame and the parsed copybook) are very different, which limits the process I would like to build. I would still like to use your package, as I believe there are inherent synergies.
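To make the encoding problem concrete, here is a minimal sketch in plain Python of the kind of decoding I have in mind for a single text field. It assumes the field uses EBCDIC code page 037 (the `cp037` codec in the Python standard library); the actual code page of the source system may differ, and the helper name `decode_ebcdic_text` is hypothetical. It only covers character fields: binary or packed-decimal (COMP-3) fields cannot be decoded as text and would need the copybook layout.

```python
# Assumption: text fields are encoded with EBCDIC code page 037 (US/Canada).
# Other code pages (for example cp500 or cp1140) would need a different codec name.
EBCDIC_CODEC = "cp037"


def decode_ebcdic_text(raw: bytes, codec: str = EBCDIC_CODEC) -> str:
    """Decode an EBCDIC-encoded character field and strip trailing pad spaces.

    Hypothetical helper for illustration only; it does not handle
    binary or packed-decimal (COMP-3) fields.
    """
    return raw.decode(codec).rstrip(" ")


if __name__ == "__main__":
    # "HELLO" followed by two EBCDIC spaces (0x40), as a fixed-width field might store it.
    sample = bytes([0xC8, 0xC5, 0xD3, 0xD3, 0xD6, 0x40, 0x40])
    print(decode_ebcdic_text(sample))  # prints "HELLO"
```

Something like this could presumably be wrapped in a Spark UDF and applied per column after the data has landed in the DataFrame, but it would not reuse the copybook definitions, which is why I would prefer to use your decoder.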
Question
There are a few questions:
- Is there any advice you can give me regarding my use case and how best to use your package for it?
- Is there a way to use just your decoding logic, either while the data is in flight or after it has landed in the DataFrame?
- Is there a way to flatten the schema structure once the parser has completed?
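To illustrate what I mean by flattening in the last question, here is a rough sketch in plain Python that works on a nested dictionary rather than on a Spark schema. The function name `flatten_record`, the underscore separator, and the field names in the sample are all hypothetical; it also ignores arrays (OCCURS clauses), which a real solution would need to handle.

```python
from typing import Any, Dict


def flatten_record(record: Dict[str, Any], parent: str = "", sep: str = "_") -> Dict[str, Any]:
    """Flatten nested groups into a single level, joining names with `sep`.

    Hypothetical helper for illustration; lists (OCCURS) are left as-is.
    """
    flat: Dict[str, Any] = {}
    for name, value in record.items():
        key = f"{parent}{sep}{name}" if parent else name
        if isinstance(value, dict):
            # A nested group: recurse and carry the parent name as a prefix.
            flat.update(flatten_record(value, key, sep))
        else:
            flat[key] = value
    return flat


if __name__ == "__main__":
    # Made-up record shaped like a copybook with nested groups.
    nested = {
        "CUSTOMER": {"ID": 42, "NAME": {"FIRST": "ANNA", "LAST": "SMITH"}},
        "BALANCE": 10.5,
    }
    print(flatten_record(nested))
    # {'CUSTOMER_ID': 42, 'CUSTOMER_NAME_FIRST': 'ANNA', 'CUSTOMER_NAME_LAST': 'SMITH', 'BALANCE': 10.5}
```

The desired result is one column per leaf field, so that the output maps cleanly onto a flat Parquet schema.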
Your assistance would be greatly appreciated.