Scraper only collects 118 byte files #49
Comments
Same issue here #48. I'm using https://github.com/mikf/gallery-dl which is working fine |
One thing I noticed was that the sub-domain i.vsco.co returns 403, but using a URL like vsco.co/i returns the image without a problem. I'm no programmer, but when I have some free time I may try to refactor at least one of the modules to support that change and see what happens. |
@Pr0j3ct what do you mean? I put a print statement into the script to see what it was trying to download. What printed out matched what I got when manually going to the gallery page, selecting an image, and then inspecting it. |
The API has definitely changed.
Digging through the gallery-dl project I can see that they’re using a
different API call
It’s essentially /api/3.0/
Whereas the current version of this project uses /api/2.0/
…On Thu, Jul 25, 2024 at 10:21 AM Project ***@***.***> wrote:
One thing I noticed was that the sub-domain returns 403:
i.vsco.co
but using url like so:
vsco.co/i
returns the image without problem.
I'm no programmer but when I have some free time I may try and refactor
at least one of the modules to support that change and see what happens.
|
Edit: It seems they block the default request header that the script uses. You can simply set a custom header on your requests to get the images.
Alternatively, you could use cloudscraper instead of Python's requests library. Install it with pip install cloudscraper
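A minimal sketch of the custom-header idea, in case it helps. The thread does not record which header was actually used, so the User-Agent string and the helper name fetch_image are assumptions for illustration only, not the scraper's own code.

```python
# Assumption: a browser-like User-Agent is enough to avoid the block.
# The exact string below is illustrative, not taken from the thread.
BROWSER_HEADERS = {
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64; rv:128.0) Gecko/20100101 Firefox/128.0",
}


def fetch_image(url, session=None, headers=BROWSER_HEADERS, timeout=30):
    """Download `url` with custom headers and return the raw bytes.

    `fetch_image` is a hypothetical helper, not part of vsco-scraper.
    """
    if session is None:
        # Imported lazily so a different session object can be passed in.
        import requests
        session = requests.Session()
    response = session.get(url, headers=headers, timeout=timeout)
    # Fail loudly on 403 and similar instead of saving the error body to disk.
    response.raise_for_status()
    return response.content
```

For the cloudscraper route, the same helper should work by passing session=cloudscraper.create_scraper(), since that call returns a requests-style session.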
|
That works perfectly, thank you! |
Could someone please explain how to do this? Would like to get this working again. I've tried gallery-dl but prefer vscoscraper. |
I've already explained how to do this. |
I can see where to replace the text in the constants.py file, but I'm not sure where to add the text to the vscoscrape.py file. I've tried adding it at the end, but I get an error message when I run the script. Cheers |
nothing to replace in constants.py, just add images dict |
Hey, so I am not a programmer in the least, the first two files you are referring to constants.py and vscoscrape.py, where are those located? and where are those new entries supposed to be in the files you mention? Of course any help is sincerely appreciated! Edit: so when I look through the git for vsco-scraper I see the two files you are talking about, I am not sure what I am supposed to do with those files. I installed vsco-scraper with pip, so in this case do I need to edit the source and perform a build/compile or something along those lines? Forgive me, I only know that the vsco-scraper is in the bin folder off of my linux profile, after that I have zero ideas on what to do... =( |
if you installed vscoscrape with pip the files are located in your python installation. No need to build from source. Just use the pip package and do the following.
Now open vscoscrape.py and search for download_img_normal
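The steps could look roughly like the sketch below. It assumes the pip package installs an importable module named vscoscrape (an assumption based on the file names in this thread), and the header value is illustrative. The dict name images and the headers=constants.images argument are the ones mentioned later in this thread; installed_package_dir is a hypothetical helper for finding the files.

```python
import importlib.util
import os


def installed_package_dir(module_name="vscoscrape"):
    """Return the folder holding an installed module, or None if not found."""
    spec = importlib.util.find_spec(module_name)
    if spec is None or spec.origin is None:
        return None
    return os.path.dirname(spec.origin)


# Step 1: in constants.py, add a headers dict named `images`.
# Assumption: the User-Agent value is illustrative, not the original one.
images = {
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64; rv:128.0) Gecko/20100101 Firefox/128.0",
}

# Step 2: in vscoscrape.py, inside download_img_normal, add
#     headers=constants.images
# to each requests.get(...) call that fetches a .jpg or .mp4.

if __name__ == "__main__":
    # Prints the folder containing constants.py and vscoscrape.py, if installed.
    print(installed_package_dir())
```

Keep the indentation of the edited lines the same as the lines around them, otherwise Python raises an IndentationError.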
|
Thank you very much!! Those changes were easy enough, first attempt gave me an indentation error, I just needed to move the
Edit: I tested it for journals, and it produces the 118 byte files. I tried to sort it out, but the block for journals is very different...
Edit: I figured it out. I looked for the function for downloading journals, added "headers=constants.images" to the jpg and mp4 lines, and it worked like a charm! I'm certainly not a python programmer now...lol but reading through your code, I see that constants.images must refer to the constants.py file and the .images must refer to the images entry that you had me add! Thanks for helping me see it! =) |
thanks vm @timbo0o1, i know there is gallery-dl but it doesn't keep the same original filename and for updating an old folder it was ass |
hey, thanks for the previous help. unfortunately the script has stopped working again. i tried to run it, but it shows '... crashed' for every username in my txt file. please take a look... thank you |
Maybe take a look in here #50 |
Approx 2 weeks ago the scraper started collecting only 118 byte files.
Does not appear to be IP address related. Has the VSCO API changed?