Facebook image archiving #41

Closed
djhmateer opened this issue Jun 16, 2022 · 3 comments
Labels
archiver enhancement New feature or request

Comments

@djhmateer
Contributor

djhmateer commented Jun 16, 2022

Archiving of image(s) on Facebook is not supported yet and would be very useful.

Placeholder issue to collect ideas on how it could be done.

Background

https://github.com/djhmateer/auto-archiver#archive-logic lists what works and what doesn't. Facebook video works using youtube_dlp.

In the fork above, to get a Facebook screenshot I am using automation to click through the accept-cookies page, as we don't want the cookie popup in the screenshot.
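For anyone wanting to try the same thing, here is a rough sketch of the click-then-screenshot idea. It is not the code from the fork. The XPath locators are guesses (Facebook changes and localises that markup), and `dismiss_cookie_banner` and `screenshot_post` are names made up for illustration. The `driver` argument is expected to be a Selenium WebDriver; only `get`, `find_elements` and `save_screenshot` are called on it.

```python
import time

# Assumed consent-button locators. Facebook changes and localises this
# markup, so treat these as placeholders to adjust after inspecting the page.
COOKIE_BUTTON_XPATHS = (
    '//*[@aria-label="Allow all cookies"]',
    '//button[@data-cookiebanner="accept_button"]',
)


def dismiss_cookie_banner(driver, xpaths=COOKIE_BUTTON_XPATHS):
    """Click the first consent button found; return True if one was clicked."""
    for xpath in xpaths:
        # "xpath" is the value of Selenium's By.XPATH constant.
        for element in driver.find_elements("xpath", xpath):
            try:
                element.click()
                return True
            except Exception:
                continue  # e.g. hidden or stale element: try the next match
    return False


def screenshot_post(driver, url, out_path, settle_seconds=3.0):
    """Load url, dismiss the cookie banner if present, then save a PNG."""
    driver.get(url)
    time.sleep(settle_seconds)  # crude wait for the dialog to render
    clicked = dismiss_cookie_banner(driver)
    if clicked:
        time.sleep(settle_seconds)  # give the dialog time to disappear
    driver.save_screenshot(out_path)
    return clicked
```

With Selenium installed, `screenshot_post(webdriver.Firefox(), url, "post.png")` would exercise it. The fixed sleeps keep the sketch short; an explicit wait on the button would likely be more reliable.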

To get a Facebook post link

"Each Facebook post has a timestamp on the top (it may be something like Just now, 3 mins or Yesterday). This timestamp contains the link to your post. So, to copy it, simply hover your mouse over the timestamp, right click, then copy link address"

Example

As an example of Facebook images which we would like to archive:

https://www.facebook.com/chelseymateerbeautician/posts/pfbid0mhimrwfeBpWKwBUFna28Q3RfaEK8HETcEpk1QXoEeFXHVwaa7oxLxKTHbBqu5nPpl
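An archiver would probably want a stable key per post. As a minimal sketch, the permalink above can be split into its page name and post id with the standard library. Only the `/<page>/posts/<post_id>` shape shown in this example is handled; other Facebook URL shapes are out of scope here, and `parse_post_url` and `archive_key` are hypothetical helper names.

```python
from urllib.parse import urlparse


def parse_post_url(url):
    """Return (page, post_id) for a /<page>/posts/<post_id> permalink.

    Only this one URL shape is handled; anything else raises ValueError.
    """
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host != "facebook.com" and not host.endswith(".facebook.com"):
        raise ValueError(f"not a Facebook URL: {url}")
    parts = [p for p in parsed.path.split("/") if p]
    if len(parts) != 3 or parts[1] != "posts":
        raise ValueError(f"unrecognised post URL shape: {url}")
    return parts[0], parts[2]


def archive_key(url):
    """Build a flat key of the form 'facebook_<page>_<post_id>'."""
    page, post_id = parse_post_url(url)
    return f"facebook_{page}_{post_id}"
```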

https://gist.github.com/pcardune/1332911 - potentially this may help.

#26 - @msramalho talked about the potential of https://archive.ph/

@msramalho
Contributor

Hi Dave, thanks for opening this discussion.

From #26 we should not use archive.ph since it has captchas and will stop working rather quickly.

My fear is that automation with Selenium will lead to the same result, as Facebook is quite aggressive about that. Nonetheless, it's the only option still on the table (unless some potentially hacky library is around). If you want to give it a go, you can try using the webdriver and clicking through to the content, but this would need some experiments to see how quickly Facebook detects the automated behaviour and blocks the page. Tbh, this has been the reason we have not tried automating Facebook posts :/
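One way such an experiment could be framed, as a sketch only: visit a list of post URLs with randomised pauses and count how many load before the browser ends up on a login or checkpoint page. Treating a `/login` or `/checkpoint` redirect as "blocked" is an assumption, not documented Facebook behaviour, and both function names are made up for illustration. `driver` is expected to be a Selenium WebDriver (only `get` and `current_url` are used).

```python
import random
import time


def looks_blocked(current_url):
    """Heuristic (an assumption): a redirect to a login or checkpoint page."""
    return "/login" in current_url or "/checkpoint" in current_url


def loads_before_block(driver, urls, min_delay=5.0, max_delay=15.0):
    """Visit urls in order; return how many loaded before a block was seen."""
    loaded = 0
    for url in urls:
        driver.get(url)
        if looks_blocked(driver.current_url):
            break
        loaded += 1
        # Pause a random amount so requests are not evenly spaced.
        time.sleep(random.uniform(min_delay, max_delay))
    return loaded
```

Running this with different delay ranges would give a rough idea of how much pacing matters before committing to a full archiver.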

loganwilliams changed the title from "Facebook Images" to "Facebook image archiving" on May 11, 2023
loganwilliams added the "enhancement" (New feature or request) and "archiver" labels on May 11, 2023
@djhmateer
Contributor Author

I've made a fb archiver on my fork of this codebase: https://github.com/djhmateer/auto-archiver/blob/main/auto_archive_fb.py

It has been running well in production (with caveats!). It's pretty specialised and has to run on its own server.

Maybe close this as an issue, as it is working well enough for me.

@msramalho
Contributor

Closing this issue for now, since the caveats with Facebook archiving usually require passive accounts and proxies. The wacz_archiver_enricher can be used in combination with Facebook login credentials to achieve this result.
