Add link from page details view to Internet Archive #196

Closed
Mr0grog opened this issue Jan 30, 2018 · 7 comments

Comments

@Mr0grog
Member

Mr0grog commented Jan 30, 2018

(Updated 2019-06-17)

Where possible, we should link to the original source of pages/versions from the Wayback Machine. Analysts typically include Wayback links and screenshots in their reports (partially because our tools aren’t public-access, but also because Wayback is a pretty good system of public reference in general), so having the links directly in our views would make that a lot easier for them.

We currently show links to Versionista diffs in the top area of the page details view:

[Image: versionista-diff-link]

(This is the VersionistaInfo component.)

It would be great to replace that with a link to:

  • Definitely: Each of the versions being compared (if those versions are from Wayback and not some other source).
    You can find the correct link in the source_metadata.view_url property of a version. See an API response like this for an example: https://api.monitoring.envirodatagov.org/api/v0/pages/1829df77-b37e-49dc-998b-0fc964e7f24e/versions

  • Optionally: A link to the listing/calendar view of all the snapshots of the given page in Wayback. To create that link, we’ll have to compose it from the data we have available. It always looks like https://web.archive.org/web/*/<page_url>, e.g. http://web.archive.org/web/*/https://www.fhwa.dot.gov/about/staff.cfm

    This could be a simple link right next to the original URL:

    [Image: view-in-wayback-link]
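
For illustration, composing the page-level calendar link could look roughly like the sketch below. The helper name `waybackCalendarUrl` is hypothetical (it is not taken from web-monitoring-ui), and the sketch assumes the page URL is appended as-is, without percent-encoding, as in the examples above.

```typescript
// Hypothetical helper; the name is an assumption, not part of web-monitoring-ui.
// Builds the Wayback Machine calendar/listing URL for a page.
function waybackCalendarUrl(pageUrl: string): string {
  // The page URL is appended unencoded, matching the example links in this issue.
  return `https://web.archive.org/web/*/${pageUrl}`;
}

// Example call using a URL from this issue.
const calendarLink = waybackCalendarUrl('https://www.fhwa.dot.gov/about/staff.cfm');
console.log(calendarLink);
```

Because this link depends only on the page URL, it can be built for any page regardless of where its versions came from.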


As I’m sitting here watching the report writing training, I’m wondering if it would be a good idea to add a link to the Internet Archive history/calendar view for a page (e.g. http://web.archive.org/web/*/https://www.fema.gov/flood-insurance-reform-mapping-flood-hazards). It seems like there’s typically a lot of cross-referencing/checking with Internet Archive when building a report.

This might just be getting ahead of ourselves. Once we start pulling in data directly from Internet Archive, would it still be useful?

/cc @trinberg

@stale

stale bot commented Jan 27, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in seven days if no further activity occurs. If it should not be closed, please comment! Thank you for your contributions.

@stale stale bot added the stale label Jan 27, 2019
@Mr0grog
Member Author

Mr0grog commented Jan 28, 2019

We use Internet Archive snapshots in a lot of reports and analysis, so this is still an incredibly useful (and not too hard to implement!) feature — we might even consider it for all versions, even when they weren’t originally sourced from the Internet Archive.

@stale stale bot removed the stale label Jan 28, 2019
@Mr0grog Mr0grog added this to Ready in Web Monitoring May 23, 2019
@Mr0grog Mr0grog changed the title from "Add link from page details view to Internet Archive?" to "Add link from page details view to Internet Archive" Jun 17, 2019
@SYU15
Contributor

SYU15 commented Jul 7, 2019

> We use Internet Archive snapshots in a lot of reports and analysis, so this is still an incredibly useful (and not too hard to implement!) feature — we might even consider it for all versions, even when they weren’t originally sourced from the Internet Archive.

Do you still feel we should add these links even to the pages that were sourced from Versionista? I am starting to look into this.

@Mr0grog
Member Author

Mr0grog commented Jul 10, 2019

Yes! More:

For the version-level link (the one that comes from source_metadata.view_url), it’s only really applicable if the source was Wayback, but that’s a version-by-version question rather than a page-level question.

For the page-level link (e.g. https://web.archive.org/web/*/https://climate.nasa.gov/), I think we still always want them — analysts use these to see what additional data Wayback might have about a page, even if we haven’t imported any captures from Wayback.
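
As a rough sketch of that split, the page-level link can always be built while the version-level link is only filled in for Wayback-sourced versions. The `VersionRecord` and `ArchiveLinks` shapes and the `archiveLinksFor` helper are hypothetical names for illustration; only `source_type`, `source_metadata`, `view_url`, and the `internet_archive` value come from this thread.

```typescript
// Minimal shape of a version record; only the fields named in this thread are modeled.
interface VersionRecord {
  source_type: string;
  source_metadata?: { view_url?: string };
}

interface ArchiveLinks {
  calendar: string;        // page-level link: always present
  snapshot: string | null; // version-level link: only for Wayback-sourced versions
}

// Hypothetical helper, not part of web-monitoring-ui.
function archiveLinksFor(pageUrl: string, version: VersionRecord): ArchiveLinks {
  const fromWayback = version.source_type === 'internet_archive';
  const viewUrl = version.source_metadata ? version.source_metadata.view_url : undefined;
  return {
    calendar: `https://web.archive.org/web/*/${pageUrl}`,
    snapshot: fromWayback && viewUrl ? viewUrl : null,
  };
}
```

A caller would render the calendar link unconditionally and skip the snapshot link whenever `snapshot` is `null`.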

@SYU15
Contributor

SYU15 commented Jul 13, 2019

Ok, sounds good @Mr0grog. It seems to me that there are two URLs being passed into the view, one for each page version that's being compared (props.to & props.from). Sometimes both have source_metadata.view_url from Wayback; other times one (or possibly both, though I haven't run into this case) is from Versionista and therefore doesn't have that field.

Am I supposed to only render source_metadata.view_url from the latest page version or both of them if they exist?

@Mr0grog
Member Author

Mr0grog commented Jul 14, 2019

> Am I supposed to only render source_metadata.view_url from the latest page version or both of them if they exist?

For both of them if they exist.

> Sometimes both have source_metadata.view_url from Wayback, other times one (or both, possibly, though I haven't run into this case) is from Versionista and therefore doesn't have that field.

Yep, this is by design. source_metadata has a standard schema based on source_type, so you should expect different data for a versionista version than an internet_archive one, but you should expect the same kinds of data for any two versions with the same source_type. It’s basically all the additional data we could get from the source that might be unique to that source or that doesn’t fit the standard page/version model.
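
One way to picture this is as a union keyed on source_type, so that view_url is only reachable once a version is known to be an internet_archive one. This is a sketch under assumptions: the type names and the `snapshotLinks` helper are hypothetical, the Versionista metadata fields are deliberately left unmodeled, and the real UI code may be organized differently.

```typescript
// Only source_type, source_metadata, and view_url come from this thread;
// everything else here is an illustrative assumption.
type InternetArchiveVersion = {
  source_type: 'internet_archive';
  source_metadata: { view_url: string };
};

type VersionistaVersion = {
  source_type: 'versionista';
  // Versionista-specific fields exist but are not modeled in this sketch.
  source_metadata: Record<string, unknown>;
};

type Version = InternetArchiveVersion | VersionistaVersion;

// Collects Wayback links for both versions in a comparison, skipping any
// version whose source does not provide a view_url.
function snapshotLinks(from: Version, to: Version): string[] {
  const links: string[] = [];
  for (const version of [from, to]) {
    if (version.source_type === 'internet_archive') {
      // Checking source_type narrows the type, so view_url is available here.
      links.push(version.source_metadata.view_url);
    }
  }
  return links;
}
```

With this shape, a comparison of two Wayback versions yields two links, a mixed comparison yields one, and a comparison of two Versionista versions yields none.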

SYU15 added a commit to SYU15/web-monitoring-ui that referenced this issue Jul 14, 2019
SYU15 added a commit to SYU15/web-monitoring-ui that referenced this issue Jul 14, 2019
SYU15 added a commit to SYU15/web-monitoring-ui that referenced this issue Jul 14, 2019
SYU15 added a commit to SYU15/web-monitoring-ui that referenced this issue Jul 19, 2019
SYU15 added a commit to SYU15/web-monitoring-ui that referenced this issue Jul 26, 2019
SYU15 added a commit to SYU15/web-monitoring-ui that referenced this issue Jul 26, 2019
Mr0grog pushed a commit that referenced this issue Jul 29, 2019
@Mr0grog
Member Author

Mr0grog commented Aug 1, 2019

Fixed in #381. We forgot to make sure the commit comment was formatted correctly to auto-close this!

@Mr0grog Mr0grog closed this as completed Aug 1, 2019
Web Monitoring automation moved this from Ready to Done! Aug 1, 2019