It's worth trying. With the recent activity around async task frameworks, it would be fairly simple to move CSV upload to an async process and email the user once it's done.
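A rough sketch of that idea, using only the standard library as a stand-in for a real task queue (a real deployment would more likely hand this to Celery, which Superset already uses for async queries). The names `submit_upload`, `load` and `notify` are hypothetical and not part of Superset:

```python
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable

# Stand-in for a real task queue; a single worker keeps uploads serialized.
_executor = ThreadPoolExecutor(max_workers=1)


def submit_upload(
    load: Callable[[], int],
    notify: Callable[[str], None],
) -> Future:
    """Run `load` in the background and report the outcome through `notify`.

    `load` is a hypothetical callable that performs the slow database write
    and returns the number of rows written. `notify` is a hypothetical
    callable that would send the email to the user.
    """

    def job() -> None:
        try:
            rows = load()
        except Exception as exc:  # report failures too, not only successes
            notify(f"Upload failed: {exc}")
        else:
            notify(f"Upload finished: {rows} rows written")

    # submit() returns immediately, so the web request can respond
    # long before any gateway timeout is reached.
    return _executor.submit(job)
```

The point of the design is that the HTTP request only enqueues the work, so the user sees an immediate "upload accepted" response and gets the real result later, whether it succeeded or failed.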
I added the last line here to my mssql.py: But it's not causing the So either I added it in the wrong place for it to take effect, or the codebase is broken and it doesn't properly ingest and apply this setting...
Summary: uploading a spreadsheet to the data warehouse via Superset is unsatisfactory in two ways:
This post is about (1) but either way, there should be a better user experience to address (2). Could it be handled by a worker instead of the main Superset application?
Specifics:
My users think that their spreadsheet uploads fail because they get a gateway error message after 30 or 60 seconds. I benchmarked an upload of an .xlsx file using Superset 4.1.0rc2. The file is 182 KB, with just over 5k rows and 4 columns of data (1 date and 3 float columns). Not a huge file.
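A timing comparison of this kind (pandas `to_sql` with `method=None` versus `method='multi'`) could be set up roughly as follows. This is only a sketch: it uses synthetic data of the same shape and an in-memory SQLite database so that it runs anywhere, and the table name `upload_test` is made up. SQLite will not show the gap seen against SQL Server; to measure the real thing, pass a SQLAlchemy engine for the warehouse as `con` instead.

```python
import sqlite3
import time

import pandas as pd


def time_to_sql(df: pd.DataFrame, con, method, chunksize=None) -> float:
    """Return the seconds taken by a single DataFrame.to_sql call."""
    start = time.perf_counter()
    df.to_sql(
        "upload_test",        # hypothetical table name
        con,
        if_exists="replace",  # start from an empty table on every run
        index=False,
        method=method,
        chunksize=chunksize,
    )
    return time.perf_counter() - start


# Synthetic data shaped like the file described above: 1 date + 3 float columns.
n_rows = 5000
df = pd.DataFrame(
    {
        "day": pd.date_range("2024-01-01", periods=n_rows, freq="D"),
        "a": [i * 0.5 for i in range(n_rows)],
        "b": [i * 1.5 for i in range(n_rows)],
        "c": [i * 2.5 for i in range(n_rows)],
    }
)

con = sqlite3.connect(":memory:")
t_default = time_to_sql(df, con, method=None)
# chunksize keeps each multi-row INSERT under the bound-parameter limit
# (200 rows * 4 columns = 800 parameters per statement).
t_multi = time_to_sql(df, con, method="multi", chunksize=200)
row_count = con.execute("SELECT COUNT(*) FROM upload_test").fetchone()[0]
print(f"method=None: {t_default:.3f}s, method='multi': {t_multi:.3f}s, rows: {row_count}")
```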
Benchmarking:
- to_sql, method = 'multi': 16 seconds
- to_sql and method = None: 250 seconds
Possible fix:
Poking around in the code base, it looks like if I changed the MSSQL db engine spec, I could get Superset to write with method = 'multi'. See here: superset/superset/db_engine_specs/base.py, line 1286 in 5e42d7a.
I guess I could just add supports_multivalues_insert = True to the SQL Server spec?
But I'm confused: why then does supports_multivalues_insert not appear in a single existing db engine spec, as far as I can see? The only idea I have is that all of the DBs commonly used by developers override the df_to_sql function, so they don't care about this variable?
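One caveat if method = 'multi' does end up being used against SQL Server: each multi-row INSERT binds rows × columns parameters, and SQL Server caps both the parameters per request (2100) and the rows in a single VALUES list (1000). A chunksize therefore needs to be passed along with it. A minimal sketch of picking one; the helper name and the safety margin are my own assumptions, not anything from Superset or pandas:

```python
# Documented SQL Server limits; check them against your version and driver.
MAX_PARAMS_PER_REQUEST = 2100      # bound parameters per request
MAX_ROWS_PER_VALUES_LIST = 1000    # rows in one INSERT ... VALUES list
SAFETY_MARGIN = 3                  # assumed headroom for driver-added parameters


def safe_multi_chunksize(n_columns: int) -> int:
    """Largest row count per INSERT that stays within both limits."""
    if n_columns < 1:
        raise ValueError("n_columns must be at least 1")
    by_params = (MAX_PARAMS_PER_REQUEST - SAFETY_MARGIN) // n_columns
    return max(1, min(MAX_ROWS_PER_VALUES_LIST, by_params))


# For the 4-column file above: (2100 - 3) // 4 = 524 rows per statement.
print(safe_multi_chunksize(4))
```

The result would be passed as `chunksize` next to `method='multi'` in the `to_sql` call.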