
Command-line-parameter for alternate Mime type mappings #136

Closed
josundt opened this issue Dec 4, 2018 · 52 comments

@josundt commented Dec 4, 2018

Which version of the AzCopy was used?

8.10-netcore

Which platform are you using? (ex: Windows, Mac, Linux)

Windows

Azure DevOps has a predefined Build/Release task definition built on AzCopy to publish/copy files or complete folders to Azure Blob Storage.

https://github.com/Microsoft/azure-pipelines-tasks/tree/master/Tasks/AzureFileCopyV2

We found that we can't use this task because it does not allow us to specify the file-extension-to-content-type mapping for the individual files ourselves, and even in v8.10 of AzCopy the mapping that ships in AzCopyConfig.json is incomplete.

There is also no universal truth about the correct content-type for a given file extension; ideally, this should be decided by the owner of the storage account.

If AzCopy just had a command-line switch to specify the path to AzCopyConfig.json, we could use our own mapping file and the Azure DevOps task would work for us.

Would it be possible to add this?

@zezha-msft (Contributor)

Hi @josundt, thanks for reaching out!

If sticking to AzCopy V8 is a mandatory requirement, then @EmmaZhu could perhaps chime in to help with your issue.

However, if you don't mind switching to the V10, we have this working as of today. Here is the documentation:

Please note that AzCopy automatically detects the Content-Type of files when uploading from local disk, based on the file extension or file content (if there is no extension).

The built-in lookup table is small, but on unix it is augmented by the local system's mime.types file(s) if available under one or more of these names:

  • /etc/mime.types
  • /etc/apache2/mime.types
  • /etc/apache/mime.types

On Windows, MIME types are extracted from the registry. This feature can be turned off with the help of a flag. Please refer to the flag section.
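The OS-backed lookup described above can be illustrated with Python's mimetypes module, which consults similar system tables (an analogy only -- AzCopy is written in Go and has its own implementation):

```python
# Illustration (not AzCopy's actual code): Python's mimetypes module does the
# same kind of extension -> Content-Type lookup, seeded from a built-in table
# plus files like /etc/mime.types when they exist.
import mimetypes

print(mimetypes.guess_type("index.html")[0])  # text/html (in the built-in table)

# The thread's later point about ".ts": the answer depends on the machine's
# tables; on many Linux systems this returns video/mp2t (MPEG Transport
# Stream), never anything TypeScript-related.
print(mimetypes.guess_type("main.ts")[0])
```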

@IGx89 commented Jun 5, 2019

Apologies for commenting on an old issue, but what happened to the quoted documentation? All documentation on MIME types/Content-Type for AzCopy appears to have been scrubbed.

@zezha-msft (Contributor)

Hi @IGx89, no worries. To clarify, are you talking about V8 documentation?

@IGx89 commented Jun 6, 2019

V10 -- your quote above appears to be a snippet of V10 documentation about setting content type/MIME type, but when Binging ;) for the source document it only brings up this issue.

@JohnRusk (Member) commented Jun 6, 2019

@normesta Did we lose the mime type docs in the recent refactoring of the docs? Shall we reinstate it as a FAQ entry perhaps?

@josundt (Author) commented Jun 14, 2019

So there will no longer be a way to explicitly define the extension/mime-type mapping with AzCopy?
That is sad, because operating systems are not a source of universal truth here.

@zezha-msft (Contributor)

Hi @josundt, do you mean specifying mime-type for each individual file instead of for file extensions in general?

@josundt (Author) commented Jun 15, 2019

It is essential for us to provide our own extension/mime-type mappings when batch uploading; relying on the OS mappings will not be sufficient.
I hope for a command-line argument where we can specify the path to a JSON file with the mappings.

@JohnRusk (Member)

@josundt Just to clarify, there are two ways that AzCopy v10 can currently get Mime Types: from the OS, or by a command line parameter that specifies one Content-Type for all the files in the upload job.

We don't currently have a command line argument that can specify a path to a json file with the mappings.

@JohnRusk (Member)

Also, the "docs" extract above is actually from the in-app help, that AzCopy v10 displays with the "azcopy copy --help" command. It hasn't been scrubbed. It's still there.

@josundt, at this stage, given that we don't have a path with the mappings, the only workaround I can think of for you, with AzCopy v10, is to rely on the files noted in that doc, e.g. /etc/mime.types. If you're running on Linux you can use that directly. If you're running on Windows, you could possibly consider using Windows Subsystem for Linux (WSL) and running the Linux version of AzCopy v10 inside WSL. Sorry we don't have a cleaner workaround for you at this stage.

@josundt (Author) commented Jun 16, 2019

Some more background to make myself clearer:

We have attempted to use AzCopy to recursively copy/upload a local folder to blob storage from an Azure DevOps Pipeline Task. The goal is to upload new static content to our Azure CDN's backing store.

The static files consist of style sheet libraries, font libraries, javascript libraries and json files organized into multiple levels of subfolders (including version folders).

File formats include .css/.map/.scss for style sheets, .otf/.ttf/.woff/.woff2/.eot/.svg for web fonts, .js/.map/.ts for javascript libraries, and .json files.

We have faced limitations that we can't seem to get around with AzCopy: we can't get the content-type set correctly on the target blobs, since the mime mapping table in the build agent's OS registry is incomplete/incorrect in the context of our copy job.

The main point that I tried to emphasize in my earlier comments is that there's no such thing as a universally correct mime-type for a file extension.

Example:
A .ts file could in some contexts mean a "Transport Stream" multimedia file format.
In other contexts it could mean "TypeScript source file".

For this reason I believe that a "true" mapping table is NOT logically scoped to:

  • The operating system where AzCopy runs (AzCopy v10).
  • The AzCopy.exe working directory (AzCopy v8).

In some scenarios (like ours), full control over this mapping table is needed for a single execution of the AzCopy command.

AzCopy v8 had something close to what I'm requesting: it was possible to configure mappings in a JSON file, but the file had to be named "AzCopyConfig.json" and had to exist in the AzCopy working directory.

What I want is the possibility to use a file similar to "AzCopyConfig.json", but to tell AzCopy where to find it (the path). (In our case, it would make sense to keep this file in our git repository, which is checked out to a given path on the Azure DevOps build agent.)

I think what I am trying to accomplish here is not at all an edge-case usage of AzCopy; it is a rather straightforward, common copy job that I believe AzCopy needs to support. I would be surprised if no other people have the same need and would benefit from this feature.

This is why you should consider this a feature request for a new command line argument:

/ContentTypeMappings:<path-to-mappings-file>

If this argument is omitted, the OS mappings should be used like today.
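A sketch of the requested behavior (the JSON shape here is invented for illustration; this is a reading of the proposal above, not an existing AzCopy feature):

```python
# Hypothetical sketch of the /ContentTypeMappings:<path> proposal: consult a
# user-supplied JSON mapping first, then fall back to the OS tables as today.
import json
import mimetypes
import os

def resolve_content_type(path, mapping_file=None):
    ext = os.path.splitext(path)[1].lower()
    if mapping_file:
        with open(mapping_file) as f:
            custom = json.load(f)  # e.g. {".ts": "application/typescript"}
        if ext in custom:
            return custom[ext]
    # no custom entry: use the OS mappings, then a generic default
    return mimetypes.guess_type(path)[0] or "application/octet-stream"
```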

@JohnRusk (Member)

Thanks for the clarification @josundt, it's useful. I have logged it as a feature request in our product backlog. Unfortunately I can't give you any expected date by which we might look at it. I'm not trying to be difficult ... just aware of how much stuff is already there!

In the meantime, I wonder if there's any chance you can script modification or replacement of the v8 json file, at the time your build runs... Not ideal, I know...

@josundt (Author) commented Jun 17, 2019

That's actually what we are doing at the moment. We keep the full AzCopy folder (241 files) in our Git Repo so that we can use our custom AzCopyConfig.json file.

This works but is of course far from an ideal solution.

Hoping my real-life example makes you realize that this will become a limitation for many others as well, and that you will prioritize the feature request accordingly.

@JohnRusk (Member)

Glad to hear you have a workaround at present. And thanks again for your input to our prioritization.

@josundt (Author) commented Jul 8, 2019

How can I follow the progress of the backlog item?

@JohnRusk changed the title from "Command-line-parameter for alternate path AzCopyConfig.json" to "Command-line-parameter for alternate Mime type mappings" on Jul 8, 2019
@JohnRusk (Member) commented Jul 8, 2019

Our project management pipeline (for sprints etc) is in our internal DevOps instance, and that's where the backlog is so I can't give you a link to it, sorry.

We have work roughly planned out for the next 4 months and, I'm sorry to say, "your" backlog item isn't included in those 4 months. Currently it's sitting in our "un-triaged" bucket, which means we have not yet committed that we will actually do it (ever), and if we do, it will be at least 4 months until we start. I feel bad saying that to you, because obviously you'd like us to look at it sooner :-) But the reality is that we need to be fairly "brutal" about prioritization and it's hard to justify earlier consideration of an issue where there is actually a workaround. (Even though we agree that your workaround with v8 in your Git repo is less than ideal).

I have marked your issue with a tag that means "this may be more important than other 'untriaged' requests". That will draw our attention to it when we start planning beyond the 4 month period I've already mentioned. But even that tag does not guarantee any particular outcome.

@JohnRusk reopened this on Jul 8, 2019
@think-john

Came here to vote for this issue.

I've got a simple proof-of-concept site consisting of one HTML file, one CSS file, one JS file, and one .svg image.

The site is deployed to Azure blob storage using BitBucket Pipelines, which uses the Azure Deploy Pipe.

The pipe had a habit of uploading the .svg file as "application/octet-stream", when it needed to be "image/svg+xml".

I was able to get the Azure Deploy pipe updated so that it now uses azcopy v10 -- up from azcopy v7.3 previously.

However, azcopy v10 has the same issue, transferring an .svg as "application/octet-stream". Since the pipe is in a Docker container that's not defined by me, it's not really feasible for me to set .json in that container's /etc/ folder.

A way to manually set MIME types by passing parameters to a configuration would be great!

Most of the above is just to point out that I'm not doing anything especially arcane; I'm just deploying a simple static site, but azcopy is stomping on one of my "regular" files.

@JohnRusk (Member)

Is it just the SVGs that you have the problem with? If so, there's a slightly awkward workaround that you can probably use for now (until we fix it properly, one day). It goes like this:

  1. Upload all your files (it's OK if the SVGs are included here)
  2. Now, run AzCopy v10 again with the --content-type parameter set to "image/svg+xml" and set the other parameters such that only the SVG files are uploaded. (The upcoming v10.3 has a nice --include-file *.svg syntax for this. If you can wait a small number of weeks, that's your best option. Otherwise experiment with wildcards in the source path in v10.2.1. If you are using v10.2.1, you'll also need to add --no-guess-mime-type, because otherwise there's a bug where inference incorrectly overrides --content-type).

Step 2 overwrites the SVGs of step 1, with the same file content, but the right content type.

Obviously this workaround only works if you need it for a very small number of file types.

As noted somewhere (far) above in this thread, the --content-type parameter applies one content type to all files in the azCopy job. Hence the need to filter to SVG files when specifying that parameter.
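The two passes might look like this (the destination SAS URL and source path are placeholders; the commands are built as argv lists and left commented rather than executed):

```python
# Sketch of the two-pass workaround described above. Placeholders throughout;
# uncomment the subprocess lines to actually run azcopy.
src = "./site"
dest = "https://<account>.blob.core.windows.net/<container>?<SAS>"

# Pass 1: upload everything; SVGs get the wrongly guessed type at this point.
pass_1 = ["azcopy", "copy", src, dest, "--recursive"]

# Pass 2: re-upload only the SVGs with the correct type (v10.3+ syntax).
pass_2 = ["azcopy", "copy", src, dest, "--recursive",
          "--include-pattern", "*.svg",
          "--content-type", "image/svg+xml"]

# import subprocess
# subprocess.run(pass_1, check=True)
# subprocess.run(pass_2, check=True)
```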

@JohnRusk (Member)

BTW, I can't remember how source path wildcards interact with --recursive in v10.2.1. But I suspect the answer is that they don't. In which case you should probably wait for 10.3 if your SVGs are scattered across subdirectories.

@JohnRusk (Member)

Correction, the v10.3 parameter that I mentioned above will probably be called --include-pattern

@think-john

It is indeed just the SVG that I've had an issue with -- and the whole point of this "toy" site being deployed to multiple places is to flush out issues like this and find out which cloud providers are responsive and helpful. Thanks for this highly responsive and helpful, er... response!

What you are describing as a workaround makes sense: copy twice, with the second time JUST transferring the .svg and using a "set content type" on that copy action. I'll give that a shot.

I'll also subscribe to the changes for azcopy. Thanks!

@josundt (Author) commented Sep 21, 2019

I still think AzCopy v8 was the closest you have been to a good solution to this. In v8 you were allowed to create custom extension/mime mappings by modifying AzCopyConfig.json (ref. my comment from Jun 17).

I want that v8 feature back but with the ability to specify the path to the JSON file through command line arguments.

I currently need to keep all files for AzCopy v8 in my source repo with a custom AzCopyConfig.json to make the blob publishing task work the way I want through my Azure DevOps release pipeline.

I am surprised that this is not prioritized, as it must be an obstacle for all users of AzCopy who care at all about controlling the Content-Type response headers in blob responses after uploading them. And since you have already more or less implemented what I asked for in v8, it should be quite a small investment.

@JohnRusk (Member)

Hi @josundt. We've had to make some tough decisions about prioritization, since there are lots of things that lots of people are asking for :-) As you may know, v10 is a totally new codebase, so things that exist in v8 don't necessarily provide us with any shortcuts in v10.

@josundt (Author) commented Sep 22, 2019

OK. I'll keep patiently but expectantly waiting :-)

@JohnRusk (Member)

Good points. I don't know the answer re etc/mime.types, sorry. Thanks for outlining how and why this is important. At this stage, I don't have any update on likely scheduling of any change, sorry.

@benc-uk commented Nov 18, 2019

Being able to control MIME types would be really useful, well, actually essential.

I've discovered that AzCopy doesn't understand that the .mjs extension is JavaScript, which breaks my app (the files get uploaded as text/plain and the browser won't accept that).

@evandigby

We would be interested in this feature as well, purely because the .js extension on Windows doesn't default to the application/javascript (or x-javascript) mime type. (My guess is that it's a way to help prevent script execution somehow? shrug)

We migrated from linux -> windows with the same version of azcopy and ended up breaking a release because .js started uploading as text/plain and service workers won't load if the mime type isn't correct.

Once I dug into the docs as noted above, the workaround I came up with was to do 2 passes: 1) just the .js files with --content-type application/javascript, and 2) "the rest", which seems to map the same on Windows as it did before.

The downside of this is that some builds don't have non-js files, so we need to detect that and not run azcopy on the "other" pass when there are no non-js files, because azcopy exits nonzero if no copies are scheduled.

This isn't optimal but it works!

All that said, I second a feature to provide a CLI option to map mime types :)
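The "skip the second pass when there are no non-js files" check above can be sketched like this (the demo directory is created inline for illustration; a real pipeline would point at its build output):

```python
# Sketch: decide whether the second ("everything else") azcopy pass is needed,
# since azcopy exits nonzero when no files match.
import pathlib
import tempfile

# Demo tree standing in for the build output directory.
root = pathlib.Path(tempfile.mkdtemp())
(root / "sub").mkdir()
(root / "app.js").touch()
(root / "sub" / "style.css").touch()

non_js = [p for p in root.rglob("*") if p.is_file() and p.suffix != ".js"]
run_second_pass = bool(non_js)  # skip the pass entirely when False
print(run_second_pass)  # True: style.css is present
```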

@rickedwards2001

I'd like to add to this as I've just hit the javascript file mime type issue. I originally used the latest Storage Explorer to upload a static website to the $web blob container, but it sets the mime type of any javascript file to text/plain, which then throws an error in Chrome because the mime type is incorrect (this is using azcopy v10.3.3). I get the same issue running azcopy v10.4 from the command line, BUT if I upload the file through the Azure portal it does set the type correctly. The problem here is that the portal only allows you to upload one file at a time (from what I can see), which is useless for uploading an entire website. Doing a 2-pass copy is just a massive pain, and given that uploading and running a static web site from the blob store is an advertised feature, I would expect it to work with JavaScript files correctly. It's very frustrating.

@luanbon commented Apr 15, 2020

@rickedwards2001 I am facing exactly the same problem as you; in addition, my CDN rules are not working because the content type of the files is not set properly.

@JohnRusk (Member)

Thanks folks for the updates. We are planning to fix some of these issues (most importantly the JavaScript ones) in an upcoming release.

I'd suggest you subscribe to updates, by using the "watch" button at the top of this page, and choosing to be notified when there are new releases.

@capraynor commented May 29, 2020

Importing the following registry entries can be a temporary workaround.

Windows Registry Editor Version 5.00

[HKEY_CLASSES_ROOT\.js]
"Content Type"="text/javascript"

[HKEY_LOCAL_MACHINE\SOFTWARE\Classes\.js]
"Content Type"="text/javascript"

@josundt (Author) commented May 29, 2020

@capraynor I can't do that inside an Azure DevOps pipeline.

@rgwood commented Jun 2, 2020

This is also an issue for MSIX app deployment; specific MIME types are required for the various MSIX file extensions.

@mmulhearn

I'm running into the JS text/plain issue. I'm deploying an Angular site via Azure DevOps pipelines and using the aforementioned Azure Copy Files task, which relies on AzCopy. All JS files are uploading as text/plain and Chrome/Chromium Edge refuses to run it.

@javafrog

What's the status on this? This issue basically makes it impossible to use Azure File Copy to efficiently deploy SPAs to a Blob Storage (as proposed here: https://docs.microsoft.com/en-us/azure/api-management/howto-protect-backend-frontend-azure-ad-b2c#upload-the-js-spa-sample).

@Maetiz commented Oct 20, 2020

It's not pretty but if you use the V4 of the Azure File Copy task in Azure DevOps you can use the --content-type="application/javascript;charset=utf-8" parameter. You just need to have another step to deploy other files or other MIME types.

@mmulhearn

I ended up creating a specific proxy rule through my Azure Function that I use to proxy my SPA. Not stoked about it but it works.

@javafrog

It's not pretty but if you use the V4 of the Azure File Copy task in Azure DevOps you can use the --content-type="application/javascript;charset=utf-8" parameter. You just need to have another step to deploy other files or other MIME types.

Okay, thanks, that's the way I did it now as well. I also added a job for correctly uploading *.json files and made a task group of the three different jobs.
Important to note: using path/*.js to select the JS files does not work correctly if JS files are in a subdirectory. It is better to use the parameters --content-type "application/javascript" --recursive --include-pattern '*.js' instead.

@mmulhearn

@javafrog Does the pattern path/**/*.js work?

@javafrog commented Oct 20, 2020

@javafrog Does the pattern path/**/*.js work?

I thought about trying that as well. But it works now and I don't want to fiddle with it anymore ;)

@josundt (Author) commented Oct 20, 2020

I would like to direct the attention here to the original feature request.

To me it is bleeding obvious that batch copy jobs should be able to specify their own file-extension-to-mime-type mapping tables, because different batch copy jobs may need different mapping tables.
To me it is really strange that this is not identified as an important missing feature and prioritized accordingly.

With AzCopy v8 this was possible using a JSON file. After that it was removed and, to my knowledge, has never come back.
I still need to use AzCopy v8, and this feature was requested close to 2 years ago.

@mrlund commented Nov 4, 2020

This seems like an odd step back, since this has been working for me for quite some time.

For those on Azure Pipelines, I just tried downgrading the Azure File Copy task to v2, and can confirm that the .js mime types are set correctly. I hope this can help some people stuck on this issue.

@mark-at-tusksoft

@mrlund Switching to AzureFileCopy@2 resulted in the following error: [ERROR] The TLS version of the connection is not permitted on this storage account.
We are using a service connection (azureSubscription: SubscriptionName(guid)) if that information helps.
Version 4 seems to identify *.css and *.html files just fine and applies the correct mime types, but .js files are just text/plain.

@daerogami commented Jan 27, 2021

Apparently Azure static websites can serve files with mime-types based on extension.
There is a link to an example file that explains everything. Will test in the near future. Can't say I'm inspired to alter my pipeline and run it again at the end of my day. Will circle back with results.
If it works, that makes the AzCopy mime-type issue go away for my use-case.

Update: I removed the duplicate copy step after adding the explicit mime-type for the extension and it did not fix the issue. I'm still poking around to see if I can figure out why ASW is being so stubborn about the file's mime-type, may have to go back to the duplicate copy 😔.

@javafrog

Apparently Azure static websites can serve files with mime-types based on extension.
There is a link to an example file that explains everything. Will test in the near future. Can't say I'm inspired to alter my pipeline and run it again at the end of my day. Will circle back with results.
If it works, that makes the AzCopy mime-type issue go away for my use-case.

Good find! Actually, the "experience" for hosting SPA content on Azure has been suboptimal so far, so that new service really hits a need. As long as it is still preview, however, we won't be using it.

Regardless of any of that, this issue here still persists and should be fixed, for other use cases as well.

@kelvinyankey6

Okay, thanks, that's the way I did it now as well. I also added a job for correctly uploading *.json files and made a task group of the three different jobs.
Important to note: using path/*.js to select the JS files does not work correctly if JS files are in a subdirectory. It is better to use the parameters --content-type "application/javascript" --recursive --include-pattern '*.js' instead.

I'm a bit late to the conversation, but I also went with this strategy for uploading my SPA to get the service workers, well, working. I think with SPAs becoming more popular, giving us the opportunity to declare our own mime types is a worthwhile feature.

@zezha-msft (Contributor)

This feature was added in v10.11.

@josundt (Author) commented Jul 20, 2021

@zezha-msft Is there any documentation of how to use the new AZCOPY_CONTENT_TYPE_MAP environment variable? Is the value supposed to be the complete map, or just the path to a mapping file?

@zezha-msft (Contributor)

@josundt sorry my bad, here is the wiki link: https://github.com/Azure/azure-storage-azcopy/wiki/Custom-mime-mapping
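For reference, wiring the variable up might look like this (the "MIMETypeMapping" key follows my reading of the linked wiki page -- verify the exact schema there before relying on it):

```python
# Sketch: write a custom extension -> Content-Type map and point azcopy
# v10.11+ at it via AZCOPY_CONTENT_TYPE_MAP before launching the copy job.
# The JSON key name is taken from the wiki linked above; treat it as an
# assumption and check the wiki for the authoritative schema.
import json
import os
import tempfile

mapping = {
    "MIMETypeMapping": {
        ".ts":  "application/typescript",
        ".mjs": "text/javascript",
    }
}
path = os.path.join(tempfile.mkdtemp(), "content-type-map.json")
with open(path, "w") as f:
    json.dump(mapping, f, indent=2)

os.environ["AZCOPY_CONTENT_TYPE_MAP"] = path  # azcopy reads this at startup
```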
