
Optional tabular ingest skipping #8525

Closed
lubitchv opened this issue Mar 22, 2022 · 3 comments · Fixed by #8532

@lubitchv
Contributor

We have a pilot project to transition from Nesstar to Dataverse. One of Nesstar's features is the ability to convert files into different formats on the fly, such as SPSS, STATA, etc. Dataverse does not have this feature, but our users would really like to have all these formats available. To get around this, we plan to import all these formats from Nesstar. The problem is that all of these files will go through tabular ingest, whereas we would like only one file (SPSS) to go through tabular ingest and the files in other formats, such as STATA, to be added to datasets without ingest.

Therefore it would be nice to have an optional parameter (tabularIngest=true/false) in the add file to dataset API. Maybe this parameter could be passed down to the saveAndAddFilesToDataset function in IngestServiceBean, with something like

if (FileUtil.canIngestAsTabular(dataFile))
replaced with
if (FileUtil.canIngestAsTabular(dataFile) && tabularIngest)

What are your thoughts on this?
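
A minimal, self-contained sketch of the proposed gate, for illustration only. The DataFile class, the canIngestAsTabular body, and the MIME types below are hypothetical stand-ins, not the actual Dataverse code; only the combined condition mirrors the proposal above.

```java
import java.util.Arrays;
import java.util.List;

public class TabularIngestGateSketch {

    // Hypothetical stand-in for Dataverse's DataFile; only the fields needed here.
    static final class DataFile {
        final String name;
        final String contentType;

        DataFile(String name, String contentType) {
            this.name = name;
            this.contentType = contentType;
        }
    }

    // Illustrative MIME types only; an assumption, not the project's real list.
    private static final List<String> INGESTABLE_TYPES =
            Arrays.asList("application/x-spss-sav", "application/x-stata");

    // Hypothetical stand-in for FileUtil.canIngestAsTabular(dataFile).
    static boolean canIngestAsTabular(DataFile dataFile) {
        return INGESTABLE_TYPES.contains(dataFile.contentType);
    }

    // The proposed condition: ingest only if the file is ingestable
    // AND the caller did not opt out through the tabularIngest flag.
    static boolean shouldIngest(DataFile dataFile, boolean tabularIngest) {
        return canIngestAsTabular(dataFile) && tabularIngest;
    }

    public static void main(String[] args) {
        DataFile spss = new DataFile("survey.sav", "application/x-spss-sav");
        DataFile stata = new DataFile("survey.dta", "application/x-stata");

        // SPSS file uploaded with tabularIngest=true: goes through ingest.
        System.out.println(spss.name + " ingest: " + shouldIngest(spss, true));
        // STATA file uploaded with tabularIngest=false: stored as-is.
        System.out.println(stata.name + " ingest: " + shouldIngest(stata, false));
    }
}
```

With the flag defaulting to true, existing callers would keep the current behaviour, and only uploads that explicitly pass tabularIngest=false would skip ingest.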

@qqmyers
Member

qqmyers commented Mar 22, 2022

Have you considered the aux file API? If you add your alternate formats that way they appear in the download menu but would not be separate, first-class datafiles in the file list and wouldn't undergo ingest, etc. FWIW: We're doing ~this at QDR where we're converting docx to pdf and storing the pdf as an aux file.

@landreev
Contributor

Uploading alternative formats as auxiliary files may be a perfectly good solution for this use case. But I would very much like to finally implement this "skip option" flag/parameter, both for the API and for the UI, regardless.

@lubitchv
Contributor Author

lubitchv commented Mar 22, 2022

We will need such a feature for the add file API soon. I can make a pull request that adds the parameter to the add API using the existing code, if that is OK with everyone. Otherwise I will look into auxiliary files or add/uningest and will discuss it with my group.

pdurbin added a commit to lubitchv/dataverse that referenced this issue Apr 12, 2022
@pdurbin pdurbin added this to the 5.11 milestone Apr 12, 2022