Before #4854 we only had text/tab-separated-values. In retrospect, I don't think we ever should have added text/tsv, since it's not listed at https://www.iana.org/assignments/media-types/media-types.xhtml
Here's where text/tsv is used in the code base, as of d8a55a9:
src/test/java/edu/harvard/iq/dataverse/api/FileTypeDetectionIT.java
106: .body("data.files[0].dataFile.contentType", equalTo("text/tsv"))
src/main/java/edu/harvard/iq/dataverse/util/FileUtil.java
141: STATISTICAL_FILE_EXTENSION.put("tsv", "text/tsv");
154: public static final String MIME_TYPE_TSV = "text/tsv";
843: } else if (fileType.equalsIgnoreCase("text/tsv") || fileType.equalsIgnoreCase("text/tab-separated-values")) {
src/main/java/edu/harvard/iq/dataverse/ingest/IngestServiceBean.java
496: // "text/tsv" should be used instead:
src/main/java/edu/harvard/iq/dataverse/dataaccess/StoredOriginalFile.java
112: } else if (fileType.equalsIgnoreCase("text/tsv") || fileType.equalsIgnoreCase("text/tab-separated-values")) {
src/main/java/edu/harvard/iq/dataverse/DataFileServiceBean.java
87: private static final String MIME_TYPE_TSV = "text/tsv";
src/main/java/META-INF/mime.types
10:text/tsv tab TAB tsv TSV
src/main/java/propertyFiles/MimeTypeDisplay.properties
87:text/tsv=Tab-Separated Values
src/main/java/propertyFiles/MimeTypeFacets.properties
85:text/tsv=Data
In particular, we should look at IngestServiceBean.java. Here's a bit more of the code added in #6517:
} else if (FileUtil.MIME_TYPE_INGESTED_FILE.equals(dataFile.getContentType())) {
    // Make sure no *uningested* tab-delimited files are saved with the type "text/tab-separated-values"!
    // "text/tsv" should be used instead:
    dataFile.setContentType(FileUtil.MIME_TYPE_TSV);
}
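To make the discussion concrete, here is a rough sketch of what treating text/tsv as a legacy alias could look like. The class and method names (TsvMediaType, canonicalize, isTsv) are hypothetical and do not exist in Dataverse; this is only an illustration of the idea, not a proposed patch. Note that the IngestServiceBean snippet above appears to rely on the two strings being different (one for ingested files, one for uningested ones), so collapsing them like this would not be a drop-in change.

```java
public class TsvMediaType {
    // Media type listed in the IANA registry for tab-separated values.
    public static final String REGISTERED = "text/tab-separated-values";
    // Unregistered alias that currently appears in the code base.
    public static final String LEGACY_ALIAS = "text/tsv";

    // Hypothetical helper (an assumption, not Dataverse code): map either
    // spelling, in any letter case, to the registered type and leave every
    // other content type untouched.
    public static String canonicalize(String contentType) {
        if (contentType == null) {
            return null;
        }
        String trimmed = contentType.trim();
        if (LEGACY_ALIAS.equalsIgnoreCase(trimmed) || REGISTERED.equalsIgnoreCase(trimmed)) {
            return REGISTERED;
        }
        return contentType;
    }

    // Same test as the two-sided equalsIgnoreCase checks in FileUtil.java
    // and StoredOriginalFile.java, expressed through canonicalize().
    public static boolean isTsv(String contentType) {
        return REGISTERED.equals(canonicalize(contentType));
    }

    public static void main(String[] args) {
        System.out.println(canonicalize("text/tsv"));
        System.out.println(isTsv("TEXT/TSV"));
        System.out.println(isTsv("text/csv"));
    }
}
```

With something like this, existing rows stored as text/tsv would still be recognized while new code only ever writes the registered type. Whether that is safe depends on how the ingested/uningested distinction gets handled.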
Also related:
Some context for this issue: I was considering making a pull request to add text/tsv as an alternative to text/tab-separated-values at https://github.com/mlcommons/croissant/blob/v1.0.17/python/mlcroissant/mlcroissant/_src/operation_graph/operations/read.py#L126 but again, since text/tsv doesn't seem to be a registered media type, I don't think I should.