Creating a Snapshot API is down #21

aylusltd · 2020-08-03T04:01:55Z

https://archive.readme.io/docs/creating-a-snapshot

The example returns a 502 Bad Gateway error. It has been down for several days.

mekarpeles · 2020-08-03T19:15:41Z

Hi @aylusltd thank you for the heads up. I should have time to debug this weekend (we've had a lot going on)

In the meantime you may wish to use the official Wayback SPN 2.0 (save page now)
https://help.archive.org/hc/en-us/articles/360001513491-Save-Pages-in-the-Wayback-Machine#:~:text=Install%20the%20Wayback%20Machine%20Chrome,give%20you%20a%20permanent%20URL.

aylusltd · 2020-08-03T23:34:08Z

@mekarpeles, I can't actually use the extension. We use the API from a lambda to get back the permanent URL and queue it up for human annotation.

I guess we could try to automate that, but seems like it'd be easier for us just to read the extension code and match its API invocation.

mekarpeles · 2020-08-04T17:22:13Z

Should be back up, I think :)
The annotations feature of this Labs project is fairly experimental (and not maintained by IA). If anyone is relying on this for mission-critical purposes (i.e. you'd be upset if the annotations db disappeared), I suggest running this service locally or submitting a PR for a simple Docker setup.

This code doesn't require any privileged setup to run and installation is pretty simple:
https://github.com/ArchiveLabs/pragma.archivelab.org#installation

mekarpeles · 2020-08-04T17:22:21Z

#21 (comment)

jerclarke · 2020-08-05T15:50:32Z

I just tried the example from https://archive.readme.io/docs/creating-a-snapshot and got a 502 error.

Maybe it's just the example that's the problem, but figured I'd mention it.

What I really want is the old system where a call to http://web.archive.org/save/http://google.com would create a cache of the URL, and return a Content-Location header with the /web/* path of the archive, but I think I'm probably in the wrong place for that. That "trick" stopped working July 10 (bug report on the AmberLink project).
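For readers who relied on that older behaviour, here is a minimal sketch of how the permanent URL could be rebuilt from the `Content-Location` header described above. The helper name `permanent_url_from_headers` and the sample timestamp are hypothetical, and the thread reports that the endpoint stopped returning this header, so treat it as an illustration of the old flow rather than something that works today.

```python
from typing import Mapping, Optional

# Origin taken from the save URL quoted in this comment.
WAYBACK_ORIGIN = "http://web.archive.org"


def permanent_url_from_headers(headers: Mapping[str, str]) -> Optional[str]:
    """Return the archived URL if a /web/* Content-Location header is present."""
    # HTTP header names are case-insensitive, so normalise before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    location = lowered.get("content-location")
    if not location or not location.startswith("/web/"):
        # No usable header: the caller has to treat the capture as failed.
        return None
    return WAYBACK_ORIGIN + location
```

Returning `None` instead of raising lets a caller distinguish "no snapshot path came back" from a transport error.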

mekarpeles · 2020-08-05T19:12:20Z

If https://pragma.archivelab.org ever throws a 502, that means the service is down. Maybe it's getting DDOS'd? I just restarted the service and it looks like it's currently working.

Sincere apologies for playing close/open tag on this issue! @jerclarke or @aylusltd -- please feel free to close if this resolves your issue :) I'll defer to you two and be on standby if I can help further.

Towito · 2021-04-15T21:32:21Z

Hi, I've been trying to use this API and it appears to be down. The GET snapshot API appears to be fine, but POST requests do not appear to be working. I'm hoping to go through multiple web pages on a somewhat regular basis, so the SPN function is not particularly useful to me. Additionally, the size of the webpages is far too small to justify an Archive-It solution. Has this service been deprecated or something?

mackuba · 2021-04-28T18:48:03Z

Yeah, not working for me either…

SemjonWilke · 2021-10-06T07:51:46Z

Still 502

mekarpeles · 2021-10-07T00:40:30Z

Here's the deal --

pragma.archivelab.org was intended to be a system for saving Wayback snapshots with annotations attached. Very few snapshots have had what I'd consider descriptive, meaningful, or specific annotations attached, leading me to believe most people just want to use the Wayback API (of which there is one).

The problem here is, I now have a database of 17,000,000 URLs that y'all have archived and Postgres can't keep up.

So, the course of action I'm planning is that this code be changed to just proxy using the Wayback API and not write to the database at all, unless an annotation is added.

Given the current performance characteristics and use -- which I had not intended / planned on, given that I wrote this while I was a volunteer not even working for the Internet Archive -- I have to now consider what to do with all of this data...
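The plan in this comment (proxy the capture to Wayback and touch the database only when an annotation is attached) could look roughly like the sketch below. It is not the project's actual code: `handle_capture`, the injected `save_to_wayback` callable, the `annotation` field name, and the list standing in for the database are all assumptions made for illustration.

```python
from typing import Callable, Dict, List


def handle_capture(
    payload: Dict,
    save_to_wayback: Callable[[str], str],
    annotations_db: List[Dict],
) -> str:
    """Proxy a capture; persist a row only if the payload carries an annotation."""
    url = payload["url"]
    # The snapshot itself is always delegated to the injected Wayback call.
    snapshot_path = save_to_wayback(url)
    # Hypothetical field name: plain captures skip the database entirely.
    annotation = payload.get("annotation")
    if annotation:
        annotations_db.append(
            {"url": url, "snapshot": snapshot_path, "annotation": annotation}
        )
    return snapshot_path
```

Passing the Wayback call in as a parameter keeps the write-or-skip decision testable without any network access.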

mekarpeles · 2021-10-07T06:22:51Z

Ok, I added this capability:
https://github.com/ArchiveLabs/pragma.archivelab.org/blob/master/README.md#performing-a-capture

curl -X POST -H "Content-Type: application/json" -d '{"url": "https://google.com"}' https://pragma.archivelab.org/capture
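For anyone calling this from a script or a lambda rather than the shell, a rough Python equivalent of the curl command might look like this. The endpoint, method, header, and `url` field come from the command above. The function names are hypothetical, and the response format is not documented in this thread, so the sketch returns the raw body without parsing it.

```python
import json
import urllib.request

CAPTURE_ENDPOINT = "https://pragma.archivelab.org/capture"


def build_capture_request(url: str) -> urllib.request.Request:
    """Build the same POST as the curl command, without sending it."""
    # json.dumps guarantees a well-formed JSON body.
    body = json.dumps({"url": url}).encode("utf-8")
    return urllib.request.Request(
        CAPTURE_ENDPOINT,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def capture(url: str, timeout: float = 30.0) -> str:
    """Send the request and return the raw response body as text.

    urllib raises urllib.error.HTTPError for error statuses such as a 502.
    """
    request = build_capture_request(url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling `capture("https://google.com")` sends the request. A 502 like the ones reported in this issue would surface as an `HTTPError` that the caller can catch and retry.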

Pandaklez · 2022-06-21T09:56:26Z

It still throws a 502 error :(

ronthepennyhoarder · 2022-06-21T19:50:40Z

Getting a 502 error as well

aylusltd mentioned this issue Aug 3, 2020

Create snapshot returning 502 ArchiveLabs/pragma.archivelab.org#10

Open

mekarpeles closed this as completed Aug 4, 2020

jerclarke mentioned this issue Aug 5, 2020

Internet Archive API seems to be broken as of Jul 10, 2020: All URLs getting marked as down berkmancenter/amber_wordpress#59

Open

mekarpeles reopened this Oct 7, 2021

mekarpeles closed this as completed Oct 7, 2021

mekarpeles reopened this Jun 21, 2022
