Append instead of overwrite #19

rachelvuu · 2019-07-08T19:59:28Z

Is there any way to append to the hyper extract rather than overwrite the file?

bwiley1 · 2019-07-10T13:05:24Z

Hi Rachel,

Thanks for reaching out! Last time I checked on this, I had some trouble manipulating the .hyper or .tde files, as it looked like the writing functions in the tableausdk package were encrypted. Originally I had wanted to push out another version where you could convert a .tde or .hyper to a pandas dataframe, or otherwise manipulate the data between sources, but I ran into the same problem. I agree though, it would be a cool feature to add - I'll try to do more research on the issue. Thanks!

Best,
Ben

ghost · 2019-07-26T22:38:05Z

Hey @bwiley1,

I think the issue is around lines 139-154 in pandleau.py. That block could be abstracted so that, instead of creating a table definition from scratch, it first checks whether 'Extract' already exists and, if so, just sets table_def to the definition that already exists.

At least in the 'old' SDK, tableauSDKSample.py has an example of this -- the procedure createOrOpenExtract() first checks whether the table exists and creates it if not. Then the procedure populateExtract() gets the table schema using table.getTableDefinition().
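
For reference, a minimal sketch of that create-or-open pattern. It assumes the extract object exposes hasTable(), openTable() and addTable(), and that the table exposes getTableDefinition(), as in the legacy SDK sample. The names open_or_create_table and build_table_def are made up for illustration and are not part of pandleau or the SDK.

```python
def open_or_create_table(extract, build_table_def, table_name="Extract"):
    """Return (table, table_def), reusing the table if it already exists.

    extract          -- an already opened extract object (assumed to follow
                        the legacy SDK's hasTable/openTable/addTable methods)
    build_table_def  -- hypothetical zero-argument callable that builds a new
                        table definition; only called when the table is missing
    """
    if extract.hasTable(table_name):
        # Table is already there: open it and reuse its stored schema,
        # so new rows get appended instead of the file being rebuilt.
        table = extract.openTable(table_name)
        table_def = table.getTableDefinition()
    else:
        # First write: build the definition from scratch and add the table.
        table_def = build_table_def()
        table = extract.addTable(table_name, table_def)
    return table, table_def
```

With something like this, the row-writing code after it would not need to care whether the extract was new or already existed, since it gets the same (table, table_def) pair either way.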

However, I don't know how nicely this will play with the "add index" function of pandleau. TDEs (and I'm assuming hypers as well) aren't really meant to be read by anything but Tableau, and the SDK doesn't have any public reading functions that I'm aware of.

I should have some time when I get home to clone and play around and see if it's something that can be adjusted. I am making an assumption that these functions exist in both SDK and SDK2, but I guess I'll find out!

bwiley1 · 2019-07-27T00:00:04Z

That's very true... I think using createOrOpenExtract would also solve writing multiple tables to a single extract (another issue on this list). That would be cool if you figure it out! Let me know if there's anything I can help out with!

ghost · 2019-07-30T00:22:21Z

@bwiley1 Check it out here: https://github.com/harrison-h/pandleau/tree/load-existing-table

Was pretty straightforward. I've only tested it on the legacy SDK (as it's what I have for my use case) but it works exactly as intended. My use case is transforming very large datasets so they can be fed into Tableau, so I end up having to pass the data along to the extract in chunks.
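
A rough sketch of that chunked workflow, in plain Python so it doesn't depend on either SDK. Here write_chunk is a hypothetical callback standing in for whatever call appends one batch of rows to the existing extract. It is not a pandleau function, and iter_chunks and append_in_chunks are illustrative names only.

```python
from itertools import islice


def iter_chunks(rows, chunk_size):
    """Yield successive lists of at most chunk_size rows from any iterable."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    iterator = iter(rows)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


def append_in_chunks(rows, chunk_size, write_chunk):
    """Hand each chunk to write_chunk and return the total row count."""
    total = 0
    for chunk in iter_chunks(rows, chunk_size):
        # write_chunk is assumed to append to the extract, not overwrite it.
        write_chunk(chunk)
        total += len(chunk)
    return total
```

Only one chunk is held in memory at a time, which is the point when the full dataset is too large to transform in one go.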

Additionally, I think you're right about it allowing you to write multiple tables! As long as that argument is passed, it should work just fine. Also not tested though.

As full disclosure, I'm not in CS or anything, so feel free to point out anything in my code that could be better or improved upon. If it looks all good to you as well I can open a pull request.

bwiley1 · 2019-07-30T13:19:34Z

I think this looks fine! If you want to open a pull request I'll approve it, thanks!

bwiley1 closed this as completed Jul 31, 2019
