static legacy files that potentially need to be served #205
Comments
The first three are just files for page styling. Those should no longer be necessary. The file /dwc/text/tdwg_dwc_text.xsd is the XML Schema file for Darwin Core archives - essential. This now lives at /standards/documents/text. The .xsd files in /dwc/xsd/ are the XML Schema files for Darwin Core as XML - still used by some. The file /dwc/rdf/dwctermshistory.rdf is the previous normative Darwin Core. The normative document is now /vocabulary/term_versions.csv. The file /dwc/rdf/dwcterms.rdf was an extract of normative Darwin Core with only the currently recommended terms in it. This is deprecated. The three files /dwc/tdwg_dw_geospatial.xsd, /dwc/tdwg_dw_curatorial.xsd and /dwc/tdwg_dw_core.xsd are XML Schema files for Darwin Core in DiGIR. This is not part of the Darwin Core standard, but some DiGIR servers are still in operation. The files /dwc/tdwg_dw_record.xsd and /dwc/tdwg_dw_record_tapir.xsd are XML Schema files for Darwin Core in TAPIR. This is not part of the Darwin Core standard either, but some TAPIR servers are still in operation as well. All the references to /dwc/. look redundant and shouldn't exist. The two .xml files in /dwc/examples/text are examples of what Darwin Core Archive XML should look like. The content at /dwc/xsd/simpledarwincore/ is now at /standard/documents/simple. /dwc/trackback/ - no idea. |
OK, after some effort, I've gone through @MattBlissett's not-404 list and eliminated everything that I don't think should produce a 200. Here is what I have left:
Here are some notes about items on this list:
So the bottom line is that I think all of the redirects are probably working right in my script, and if any of the .xsd files became available from GitHub Pages, the problem would correct itself without needing to change the script. So @MattBlissett, if you want to load the restxq.xqm file from https://raw.githubusercontent.com/tdwg/rs.tdwg.org/master/html/restxq.xqm into your labs.gbif.org BaseX server instance, we can see how it behaves there. |
Given that we have not had any reports of issues in the last two years, I suspect that this issue can be closed. Opinions @peterdesmet @baskaufs ? |
Agree to close. |
I think that this issue has been fixed and as far as I know everything is getting redirected to the right place with no complaints. There is only one technical issue that should probably be considered before closing this. Refer to this section of code (https://github.com/tdwg/rs.tdwg.org/blob/master/html/restxq.xqm#L474) and beyond, which handles the legacy redirects. When I first set up these redirects, I used 301 redirects for the URLs that really shouldn't be used any more. However, I think it was @timrobertson100 who noted that 301 (moved permanently) redirects were hard to undo and that, at least for the time being, 307s (moved temporarily) would be better. So I changed all 301s to 307s. Should some or all of the 307s be changed to 301s now that things seem stable? |
Is there really functionally any difference? It seems like the 307 is a
nice option for those with commitment issues. ;-)
|
It probably doesn't make any difference, but I'm not expert enough on these matters to say for sure. I think that if you use a 301, then Google will stop indexing the old URL and just use the new one. So that may be significant. |
Maybe we can get an opinion from @timrobertson100 and/or @MattBlissett and then move this issue toward closure. |
301 is probably OK now, although it does introduce a risk in case of future changes to whatever code this is implemented in. If it's complicated, I would leave it as 307s. If it's straightforward and would be difficult to accidentally mess up (e.g. a list of those URLs) then 301s would be safe. |
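The split suggested above (permanent redirects only for a fixed list of URLs, temporary ones for patterns) can be illustrated with a minimal sketch. This is Python rather than the XQuery used in restxq.xqm, and every path, the two tables, and the function name `resolve_redirect` are hypothetical placeholders, not the actual rules on rs.tdwg.org.

```python
import re
from http import HTTPStatus

# Hypothetical exact-match legacy URLs. A plain lookup table is hard to
# break by accident, so these are marked permanent (301).
EXACT_REDIRECTS = {
    "/legacy/old-file.xsd": "/current/new-file.xsd",
}

# Hypothetical pattern-based redirects. These depend on matching logic
# that could change later, so they stay temporary (307).
PATTERN_REDIRECTS = [
    (re.compile(r"^/legacy/docs/(?P<name>[^/]+)/?$"), "/current/docs/{name}"),
]


def resolve_redirect(path):
    """Return (status, location) for a legacy path, or None if no rule applies."""
    if path in EXACT_REDIRECTS:
        return HTTPStatus.MOVED_PERMANENTLY, EXACT_REDIRECTS[path]
    for pattern, template in PATTERN_REDIRECTS:
        match = pattern.match(path)
        if match:
            location = template.format(**match.groupdict())
            return HTTPStatus.TEMPORARY_REDIRECT, location
    return None
```

Keeping the 301s in a flat table means a later change to the pattern logic cannot silently alter a redirect that clients may already have cached as permanent.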
Sounds like either route is OK. Who would actually make the changes? If that person has the time to change the 307s to 301s, let's do it. If not, let me know and we'll officially stick with 307s and close the issue. |
There actually are not a lot of 301s and they are mostly to handle specific URLs, like redirecting the pattern
which doesn't work any more (and never will) to
which does work. The redirects for patterns of URLs are generally not 301s. Since I think there is consensus to make the changes, I will just do it as part of the next release of the rs.tdwg.org repo and close this issue. |
@MattBlissett had given me a link to the vhosts redirect file, which is no longer available online. However, I had been working on setting up redirects to basically every static file found at https://github.com/tdwg/dwc/tree/gh-pages.
Perhaps a better way to get at this is to look at the list of the requests to the server that did not result in 404s (i.e. documents actually delivered). This is a better indication of what users are actually asking for. The numbers in the first column indicate the number of requests (requests for DwC term and guide dereferencing are omitted).
As you can see, some files seem to be pretty important, as they were requested hundreds or thousands of times. @tucotuco would have better insight into what these files might be used for, and what would be likely to break if we no longer provided them.
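A tally like the one described above (requests that did not return a 404, counted per path) could be produced with a short script. This is only a sketch: it assumes the server writes access logs in the Common Log Format, which the thread does not confirm, and the function name `count_delivered` is made up for illustration.

```python
import re
from collections import Counter

# Assumed Common Log Format request/status portion, e.g.
#   "GET /some/path HTTP/1.1" 200 512
LOG_LINE = re.compile(r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" (?P<status>\d{3}) ')


def count_delivered(lines):
    """Count requests per path, skipping unparseable lines and 404 responses."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match and match.group("status") != "404":
            counts[match.group("path")] += 1
    # Most frequently requested paths first.
    return counts.most_common()
```

Passing an open log file to `count_delivered` yields (path, count) pairs sorted by request count, which makes the heavily used legacy files easy to spot.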
ping @peterdesmet