Allow Image URLs without .JPG/.PNG Suffixes #51
Hi @CavalloScuro, thanks for reaching out, trying the app, and giving feedback about possible improvements. In your case, URLs without an extension pose a problem when saving to the filesystem (which filename and extension to save with), but I think it can be handled via the server response header (by parsing the MIME type). Can you supply a small set of image resources without file extensions to make testing easier? Kind regards
Dear Burak, Thank you for responding so quickly and so kindly to my inquiry. As I mentioned, I'm really impressed with your tool, which will be an enormous (invaluable, really) resource for my research. The URLs that I am dealing with right now look like this: http://digitale.bnc.roma.sbn.it/tecadigitale/img/giornale/TO00181645/1935/unico/00000001/original There is no way, that I know of, to get an absolute path to the jpg itself; I have only been able to find these URLs which are essentially HTML websites with the .jpg embedded inside of them. But when one performs the Save As function, it saves the file as a .jpg (at least in my browser). The problem, of course, is automation! I thank you for being willing to look into this solution. It means a lot. And thanks for your hard work. CS
I just had an idea: I wonder if I could just put the .jpg extensions in the "FileName" column of the spreadsheet, which would then become the full file name when the Parser assigns the extensionless file a file name. Am I off here, or could this be a possible solution?
Yeah, that could work of course. But it could require extra work to prepare the Excel file in cases where you cannot guess the extension, or if the resource changes from PNG to JPG on the server. Also, if they serve the image through an image CDN, it can check your request headers (in this case the embedded Chromium's user agent) and send WebP, AVIF, or another modern format. So the most convenient way seems to be to rely on the server response's MIME type and save with that extension on the filesystem. This task won't take long; I guess I'll have some spare time to work on it next week 👨💻
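To illustrate the MIME type idea from the comment above, here is a minimal Python sketch. It is not the application's actual code: the `MIME_TO_EXT` table and the `extension_from_content_type` function are hypothetical names invented for this example, and the list of types is an assumption that a real downloader would probably extend.

```python
from typing import Optional

# Hypothetical lookup table (an assumption for this sketch, not the app's own list).
MIME_TO_EXT = {
    "image/jpeg": ".jpg",
    "image/png": ".png",
    "image/webp": ".webp",
    "image/avif": ".avif",
    "image/gif": ".gif",
}


def extension_from_content_type(content_type: Optional[str]) -> Optional[str]:
    """Return a file extension for a Content-Type header value, or None if unknown."""
    if not content_type:
        return None
    # Drop parameters such as "; charset=binary" and normalise case.
    mime = content_type.split(";", 1)[0].strip().lower()
    return MIME_TO_EXT.get(mime)
```

Called with the `Content-Type` value of the image response, this would yield `.jpg` for `image/jpeg` and `None` for anything not in the table, so the caller can decide how to handle unrecognised types.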
I thank you so much, Burak! Very kind!
Hi @CavalloScuro, I looked into the MIME type solution, but we missed another problem: every URL ends with the name original, so when the job completes you have only one file, the last one in the Excel file, because each download overwrites the previous file of the same name :) So I guess you should also fill in column B (which is currently supported; in your case it may or may not contain an extension). For example:
So if column B contains an extension, that will be used; if not, the MIME type from the response will be used. Does that sound good to you?
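The rule described above (use the extension from column B when present, otherwise fall back to the response's MIME type) could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: `resolve_filename`, `MIME_TO_EXT`, and the sample names are hypothetical and do not come from the application's code or from pull request #54.

```python
import os
from typing import Optional

# Hypothetical lookup table (an assumption for this sketch).
MIME_TO_EXT = {
    "image/jpeg": ".jpg",
    "image/png": ".png",
    "image/webp": ".webp",
    "image/avif": ".avif",
}


def resolve_filename(column_b_name: str, content_type: Optional[str]) -> str:
    """Pick the name to save under, preferring an extension given in column B."""
    name = column_b_name.strip()
    _, ext = os.path.splitext(name)
    if ext:
        # Column B already carries an extension, so it wins.
        return name
    # Otherwise derive the extension from the Content-Type header, if recognised.
    mime = (content_type or "").split(";", 1)[0].strip().lower()
    return name + MIME_TO_EXT.get(mime, "")
```

Because each row supplies its own name in column B, files no longer collide on the shared name original. One caveat of this sketch: a column B value that contains a dot for other reasons (for example `1935.unico`) would be treated as already having an extension.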
Dear Burak, Thank you very much for taking the time to look into this and making these adjustments to your code. I really appreciate it. I'll give it a go right now and report back later. Many thanks again. CS
By the way, the new code is not usable yet; there will be a new release, and the update notes will cover this kind of case for extension-free file URLs. Kind regards
The updates are handled in pull request #54 and merged to master 👍🏼
I have been searching for well over a week for an application that does precisely what this wonderful application can do. However, I ran into one insurmountable obstacle: the website from which I am currently attempting to batch download images (of historical newspapers, journals, magazines, etc.) obscures the absolute path to the images in question. Thus, my image URLs conform to the following format:
http://digitale.bnc.roma.sbn.it/tecadigitale/img/giornale/TO00185283/1880/unico/00000001/original
It would be super if the developers of this application could allow for image URLs such as this one that do not feature a .jpg/.png file extension. As it currently stands, my URLs simply error out. I have tried to use other batch downloaders, and they successfully download these images, but none of them allow for filenames or folder names, which is what makes this application highly attractive.
Everything else about this application is incredible, and I commend the developers for their very good work.
Thank you for your consideration.