Create simple webserver system for testing #34

Open
iannesbitt opened this issue Oct 18, 2023 · 4 comments
Assignees
Labels
enhancement New feature or request

Comments

@iannesbitt
Contributor

Related to:

After working with this software for a while, I've found that there are many valid site configurations out there that we can't navigate because of limitations in the spider and harvesting system.

Given the above planned features for the spider, it would improve code testing significantly to set up a simple web server, with a robots.txt and sitemap.xml at the base, that delivers content in some of the ways commonly used by data repositories. For example, we could test navigation of JavaScript elements that render JSON-LD content after the page is loaded (e.g. MagIC, DataONEorg/member-repos#16), an application/ld+json delivery system (e.g. Harvard Dataverse, DataONEorg/member-repos#52), some valid but alternative configurations of schema.org data (e.g. CanWIN, DataONEorg/member-repos#67), and perhaps some misconfigured robots.txt scenarios (e.g. Borealis, DataONEorg/member-repos#51), all without needing to crawl the repositories themselves.
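As a rough illustration of the idea, here is a minimal sketch of such a fixture server using only the Python standard library. It serves a robots.txt, a sitemap.xml, and one page whose JSON-LD is attached by a script after load, so a spider that does not render JavaScript finds no metadata in the raw HTML. All names here (`build_routes`, `FixtureHandler`, `start_fixture_server`, `/js-rendered.html`) and the page content are assumptions made for the example, not part of the existing code base.

```python
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Hypothetical fixture page: the JSON-LD is not present as a static tag and is
# only attached to the DOM by a script, so a non-rendering spider misses it.
JS_RENDERED_PAGE = """<!DOCTYPE html>
<html><head><title>Fixture dataset</title></head>
<body>
<script>
  const el = document.createElement("script");
  el.type = "application/ld+json";
  el.textContent = JSON.stringify({
    "@context": "https://schema.org/",
    "@type": "Dataset",
    "name": "Fixture dataset"
  });
  document.head.appendChild(el);
</script>
</body></html>
"""


def build_routes(base_url):
    """Return a path -> (content type, body) map for the fixture site."""
    robots = f"User-agent: *\nAllow: /\nSitemap: {base_url}/sitemap.xml\n"
    sitemap = (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
        f"  <url><loc>{base_url}/js-rendered.html</loc></url>\n"
        "</urlset>\n"
    )
    return {
        "/robots.txt": ("text/plain", robots),
        "/sitemap.xml": ("application/xml", sitemap),
        "/js-rendered.html": ("text/html", JS_RENDERED_PAGE),
    }


class FixtureHandler(BaseHTTPRequestHandler):
    routes = {}  # filled in by start_fixture_server()

    def do_GET(self):
        route = self.routes.get(self.path)
        if route is None:
            self.send_error(404)
            return
        content_type, body = route
        payload = body.encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, format, *args):
        pass  # keep test output quiet


def start_fixture_server():
    """Start the server on a free local port; return (server, base_url)."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), FixtureHandler)
    base_url = f"http://127.0.0.1:{server.server_port}"
    FixtureHandler.routes = build_routes(base_url)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server, base_url
```

A test would call `start_fixture_server()`, point the spider at the returned base URL, and call `server.shutdown()` afterwards. A misconfigured robots.txt scenario could be modeled by swapping the `/robots.txt` entry in the route map.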

@iannesbitt iannesbitt self-assigned this Oct 18, 2023
@iannesbitt iannesbitt added v0.1.2 Version 0.1.2 item enhancement New feature or request labels Oct 18, 2023
@iannesbitt
Contributor Author

Also potentially useful for testing #35

@iannesbitt
Contributor Author

iannesbitt commented Oct 24, 2023

Example server tree:

├── metadata
│   ├── CANWIN.jsonld
│   ├── HAKAI_IYS.jsonld
│   ├── HD-301-response.jsonld
│   └── HD-redirect.jsonld
├── robots.txt
└── sitemap.xml
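
One possible way to stand up the tree above for tests, sketched with the standard library only: write the files into a scratch directory and serve them with `SimpleHTTPRequestHandler`, mapping `.jsonld` to `application/ld+json`. The file names come from the tree; the JSON-LD bodies are placeholders, and `write_tree`, `serve_tree`, and `JSONLDHandler` are hypothetical names. Any redirect behaviour the `HD-*` file names hint at is not modeled here.

```python
import functools
import json
import threading
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer
from pathlib import Path

# File names taken from the example tree; contents below are placeholders.
METADATA_FILES = [
    "CANWIN.jsonld",
    "HAKAI_IYS.jsonld",
    "HD-301-response.jsonld",
    "HD-redirect.jsonld",
]


class JSONLDHandler(SimpleHTTPRequestHandler):
    # Serve .jsonld files with the JSON-LD media type.
    extensions_map = {
        **SimpleHTTPRequestHandler.extensions_map,
        ".jsonld": "application/ld+json",
    }

    def log_message(self, format, *args):
        pass  # keep test output quiet


def write_tree(root, base_url):
    """Create robots.txt, sitemap.xml and metadata/*.jsonld under root."""
    root = Path(root)
    (root / "metadata").mkdir(parents=True, exist_ok=True)
    locs = []
    for name in METADATA_FILES:
        doc = {"@context": "https://schema.org/", "@type": "Dataset", "name": name}
        (root / "metadata" / name).write_text(json.dumps(doc), encoding="utf-8")
        locs.append(f"  <url><loc>{base_url}/metadata/{name}</loc></url>")
    (root / "robots.txt").write_text(
        f"User-agent: *\nAllow: /\nSitemap: {base_url}/sitemap.xml\n",
        encoding="utf-8",
    )
    (root / "sitemap.xml").write_text(
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
        + "\n".join(locs)
        + "\n</urlset>\n",
        encoding="utf-8",
    )


def serve_tree(root):
    """Serve root on a free local port; return (server, base_url)."""
    handler = functools.partial(JSONLDHandler, directory=str(root))
    server = ThreadingHTTPServer(("127.0.0.1", 0), handler)
    base_url = f"http://127.0.0.1:{server.server_port}"
    write_tree(root, base_url)  # the port is known only after binding
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server, base_url
```

Real fixture files copied from the repositories could replace the placeholder bodies without changing the server code.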

@iannesbitt
Contributor Author

Content negotiation using Django REST framework: https://www.django-rest-framework.org/api-guide/content-negotiation/
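
For reference, the core of content negotiation can be sketched without any framework. The following is not Django REST framework code; it is a simplified standard-library stand-in that picks between an HTML landing page and a JSON-LD document based on the request's `Accept` header. The function names and the default list of available types are assumptions for illustration, and the matching is deliberately simplified (it orders by `q` value only and ignores the specificity rules in the HTTP specification).

```python
def parse_accept(header):
    """Parse an Accept header into (media_range, q) pairs, highest q first."""
    entries = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_range = fields[0].lower()
        if not media_range:
            continue
        q = 1.0  # default weight when no q parameter is given
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed weight as "not acceptable"
        entries.append((media_range, q))
    # sorted() is stable, so entries with equal q keep the client's order
    return sorted(entries, key=lambda e: e[1], reverse=True)


def matches(media_range, media_type):
    """True if media_range (e.g. 'application/*') covers media_type."""
    if media_range in ("*/*", media_type):
        return True
    main, _, sub = media_range.partition("/")
    return sub == "*" and media_type.startswith(main + "/")


def negotiate(accept_header, available=("text/html", "application/ld+json")):
    """Return the best available media type, or None (respond with 406)."""
    if not accept_header:
        return available[0]  # no preference stated: serve the default
    for media_range, q in parse_accept(accept_header):
        if q <= 0:
            continue
        for media_type in available:
            if matches(media_range, media_type):
                return media_type
    return None
```

For example, `negotiate("application/ld+json")` selects the JSON-LD representation, while a browser-style header such as `text/html,application/ld+json;q=0.9` selects the HTML page.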

@iannesbitt iannesbitt removed the v0.1.2 Version 0.1.2 item label Jan 10, 2024
@iannesbitt
Contributor Author

Removing label as this is not necessarily related to a version.
