[usavian] Data sources: bird ranges, distributions, other information #2

trashbirdecology · 2019-12-06T19:59:09Z

krburgio · 2019-12-07T01:33:30Z

Requested shapefiles for all North American birds through the BirdLife International data portal. The estimated response time is 5-10 business days. I will let you know when I hear something back!

trashbirdecology · 2019-12-07T05:46:41Z

brilliant

trashbirdecology · 2020-01-06T17:34:21Z

any news, @krburgio, from BirdLife on adapting the files?

krburgio · 2020-01-07T19:42:17Z

He never replied to my email. I will just submit a new request with the caveat that we will not be publishing their shapefiles on our larger map.

skybristol · 2020-01-07T20:12:50Z

I apologize, in advance, if I'm overloading this one issue with too much additional complexity.

Related to #11, data restrictions (as part of a license or not) are a huge part of the metadata we need to capture behind all of these sources. This is a messy world where many data providers have not gone through the process of formalizing an actual license for their product but have provided some language about what their desires are for usage. NatureServe has been a "great" example of that where they have language that is quite restrictive for information pulled from their API. It's in the language for their API key agreement, but they haven't gone about designating an actual license.

For something like BirdLife (or any third party source this system will use), we need to determine what we are going to use this source for and make sure we nail down whether or not that can be done legitimately. Simple redistribution (i.e., serving up a WMS to show visually on a map) is often something that data owners balk at because they worry about a number of things (loss of credit, inability to capture full metrics on access, problems with versions getting out of whack, etc.). But it may be okay for us to create, use, and distribute our own value-added derivative product following whatever stipulations have been specified by the owner.

If this is just a matter of pointing a management user simply looking for "raw" information products they may find of interest about particular species, then our information system just needs to have a metadata pointer that is discoverable because we connect the dots to related information. The usage mode is then someone discovering the metadata on our end and going and visiting BirdLife to get the referenced product. But I'm guessing this will be more a matter of using the corpus of BirdLife distributions with other sources to provide an interface that takes a user's point of interest (geographic area, management objectives, ecological disturbance regime, etc.) and deriving an aggregate report on species potentially of interest to the subject input vector.

The best case would be one where BirdLife (or any of our sources) is providing an interface on their end that lets us work with their data dynamically to get the answers we want. That way, we just hit their service with some code and process the response we get. But in a whole lot of cases, we likely need the ability to acquire the digital data in whatever form we can, spin them up on our own infrastructure, and provide an interface that drives our applications. That interface is a derivative product, and we just need to make sure what we need from it is consistent with whatever usage stipulations are in effect. Since we will be following a principle of openness and transparency in what we put out, we also need to make sure that we are able to provide a clear provenance trace back to the original source and through any steps we've taken to build and provide the derivatives.
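To make the metadata and provenance idea a bit more concrete, here is a minimal sketch of how a source record and a derivative product might be tracked. Everything in it is an assumption for illustration: the class names `SourceRecord` and `DerivedProduct`, the field names, the placeholder landing page, and the example values are hypothetical and not part of any existing USAvian code.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SourceRecord:
    """Metadata pointer to a third-party source and its usage terms."""
    name: str
    landing_page: str
    license: Optional[str]        # None when the provider has no formal license
    usage_terms: str              # free-text stipulations from the provider
    redistribution_allowed: bool  # may we serve the raw data ourselves?


@dataclass
class DerivedProduct:
    """A value-added product plus the trace back to its sources."""
    name: str
    sources: List[SourceRecord]
    steps: List[str] = field(default_factory=list)

    def add_step(self, description: str) -> None:
        """Record one processing step applied to the source data."""
        self.steps.append(description)

    def provenance(self) -> str:
        """Return a human-readable trace: product, sources, then steps."""
        lines = [f"Product: {self.name}"]
        for src in self.sources:
            lic = src.license or "no formal license"
            lines.append(f"Source: {src.name} ({lic}) <{src.landing_page}>")
        for i, step in enumerate(self.steps, start=1):
            lines.append(f"Step {i}: {step}")
        return "\n".join(lines)


# Illustrative values only; the URL is a placeholder, not a real endpoint.
birdlife = SourceRecord(
    name="BirdLife species distributions",
    landing_page="https://example.org/birdlife-data-request",
    license=None,
    usage_terms="Obtained by request; raw files must not be republished.",
    redistribution_allowed=False,
)
richness = DerivedProduct(name="Species richness layer", sources=[birdlife])
richness.add_step("Loaded range polygons from the provider's geodatabase")
richness.add_step("Counted overlapping ranges per grid cell")
print(richness.provenance())
```

Keeping `license` separate from `usage_terms` is one way to handle providers that state their wishes in plain language without designating an actual license.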

trashbirdecology · 2020-01-08T01:29:53Z

@skybristol

> Simple redistribution (i.e., serving up a WMS to show visually on a map)

What is WMS?

> But it may be okay for us to create, use, and distribute our own value-added derivative product following whatever stipulations have been specified by the owner.

Yes, I would like to ensure that by combining existing data (e.g., BirdLife, BBS) that we can legally use the synthesized data product in our work here.

> If this is just a matter of pointing a management user simply looking for "raw" information products they may find of interest about particular species, then our information system just needs to have a metadata pointer that is discoverable because we connect the dots to related information.

To clarify, the idea here would be that the end-user might benefit from having a simple base layer which contains species ranges/distributions (the synthesized product of e.g. BirdLife etc.) over top of the 'bulk' of the work (the conservation related things).

> The best case would be one where BirdLife (or any of our sources) is providing an interface on their end that lets us work with their data dynamically to get the answers we want.

This sounds like a long-term goal

skybristol · 2020-01-08T09:59:49Z

> What is WMS?

https://www.opengeospatial.org/standards/wms - Essentially putting a picture of map data on a map interface. If you poke around at BirdLife a bit, you'll see they are serving their species distributions via a Geoserver here: http://birdlaa8.miniserver.com/geoserver. That gets at your last point about it being a long-term goal. If their Geoserver infrastructure is robust enough (it's on a commercial cloud provider in the UK, but I have no idea what the machine is configured to handle), you could drive much of what you might want USAvian to do directly from the same services they have online. However, they don't advertise its existence, referring instead to a geodatabase file, which I'm guessing is their preferred distribution method.
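For a sense of what "hitting their service with some code" could look like, here is a rough sketch that only builds a WMS 1.1.1 `GetMap` request URL and makes no network call. The parameter names come from the WMS standard; the `/wms` path appended to the Geoserver address and the layer name `species:ranges` are assumptions for illustration, not something confirmed about BirdLife's setup.

```python
from urllib.parse import urlencode, urlparse, parse_qs


def build_getmap_url(base_url, layer, bbox, width=512, height=512,
                     srs="EPSG:4326", image_format="image/png"):
    """Build a WMS 1.1.1 GetMap URL.

    bbox is (minx, miny, maxx, maxy) in the units of `srs`.
    """
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",  # empty string requests the layer's default style
        "SRS": srs,
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": image_format,
    }
    return f"{base_url}?{urlencode(params)}"


# Assumed endpoint path and hypothetical layer name, for illustration only.
url = build_getmap_url(
    base_url="http://birdlaa8.miniserver.com/geoserver/wms",
    layer="species:ranges",
    bbox=(-125.0, 24.0, -66.0, 50.0),  # rough lon/lat box over the lower 48
)
print(url)
```

The response to a request like this is a rendered image, not the underlying data, which is why a WMS is the "picture on a map" option rather than a way to run calculations.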

> To clarify, the idea here would be that the end-user might benefit from having a simple base layer which contains species ranges/distributions (the synthesized product of e.g. BirdLife etc.) over top of the 'bulk' of the work (the conservation related things).

Is your intent then to provide a visual interface that shows a user, here's your area of conservation interest and a visual depiction of the modeled distribution of species that may occur in that area? By "synthesized product" do you mean something like a species richness map based on multiple distributions? Or is it more about a report based on using the data for a calculation of some kind like species in the area, spatial area represented in the potential distribution, stats on the species (IUCN status, FWS status, etc.)?
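To illustrate the difference between the two options, here is a toy sketch that assumes each species range has already been rasterized to a set of `(row, col)` grid cells. That representation, the function names, and the species labels are all hypothetical; real range data would be polygons handled with a GIS library.

```python
from collections import Counter


def species_richness(ranges):
    """Count how many species ranges cover each grid cell."""
    richness = Counter()
    for cells in ranges.values():
        richness.update(set(cells))  # set() guards against repeated cells
    return richness


def species_in_area(ranges, area_cells):
    """List species whose range overlaps any cell in the area of interest."""
    area = set(area_cells)
    return sorted(name for name, cells in ranges.items() if area & set(cells))


# Made-up ranges on a 2 x 2 grid.
ranges = {
    "species_a": {(0, 0), (0, 1)},
    "species_b": {(0, 1), (1, 1)},
    "species_c": {(1, 1)},
}

richness = species_richness(ranges)
print(richness[(0, 1)])                   # number of species at cell (0, 1)
print(species_in_area(ranges, [(0, 0)]))  # species overlapping the area
```

The first function corresponds to the richness-map idea, the second to the report-style "species in the area" calculation.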

> This sounds like a long-term goal

Again, it's not necessarily a long-term thing, but many groups that provide data like this are still working in the mode of, "We have a web app and stuff behind it like a Geoserver to show stuff in our context, but if you want to use our data, here's a download you can get from us with permission and associated stipulations on use." The audience considered for these cases is almost always the individual researcher, the analytical work they are doing, and a paper they are going to publish. It's not usually the user who is going to take all or most of the data, put them together with other data, and build some regular-use application somewhere else. They may not have a problem with it, but they haven't set up the infrastructure or the legal framework to support that use.

In this case, they are essentially laying out the stipulations of CC-BY-NC or something similar and may not have a problem with what you are laying out. But I'm sure they would not want you to put copies of their geodatabase files online where someone else could find and download them, bypassing their request form and opportunity to know about who's accessing their product. (Even though anyone could figure out their Geoserver address and write code to do that now.) For our own purposes, USGS could obtain their data legitimately via the process you already set in motion, spin them up on our own Geoserver, and use that for our applications (dynamically generating species richness maps, etc.). But we would want to make sure that was completely understood and sanctioned by BirdLife and put online in a way that guided users of our derivative products back to BirdLife if they wanted to do something different, disabling the ability for our services to become a proxy for someone to bypass their system.

P.S. Sorry if I'm spouting off about a bunch of stuff you already know and are thinking about.

trashbirdecology · 2020-01-08T17:57:01Z

> Is your intent then to provide a visual interface that shows a user, here's your area of conservation interest and a visual depiction of the modeled distribution of species that may occur in that area?

Yes, the primary goal is the 'conservation network' itself. By having the taxonomic distributions/filtering system, one can then visualize the network relative to the species of interest.

> By "synthesized product" do you mean something like a species richness map based on multiple distributions?

Yes, rather than being a tool which just re-maps existing products (e.g. BirdLife distributions), we could provide a simple range/distribution map.

> Or is it more about a report based on using the data for a calculation of some kind like species in the area, spatial area represented in the potential distribution, stats on the species (IUCN status, FWS status, etc.)?

Generating reports, unless very general, sounds like an option for specific use-case decision support tools. If the tool proves useful for this purpose, then perhaps we will get to this point. But for now I do not envision the tool as being a single-decision DST. But, the usability tests may reveal opportunity in the area...

> I'm sure they would not want you to put copies of their geodatabase files online where someone else could find and download them,

Indeed. This is certainly not an objective of USAvian. Rather we would like to use their data to create the above-defined synthesized product to help visualize the conservation network.

> disabling the ability for our services to become a proxy for someone to bypass their system.

Again, yes, we do not want to provide the taxonomic geographies data products or byproducts. Rather, we will provide an otherwise non-existent conservation data layer and visualization.

> P.S. Sorry if I'm spouting off about a bunch of stuff you already know and are thinking about.

This is helpful indeed -- especially the more technical data serving bits...

trashbirdecology assigned krburgio and unassigned krburgio Dec 6, 2019

trashbirdecology added the Data needed label Dec 6, 2019

trashbirdecology changed the title ~~Identify potential and get bird range, distn layers~~ Identify bird ranges and predicted distributions shapefiles Dec 6, 2019

trashbirdecology changed the title ~~Identify bird ranges and predicted distributions shapefiles~~ Data Sources: bird ranges, distributions, other information Jan 14, 2020

trashbirdecology changed the title ~~Data Sources: bird ranges, distributions, other information~~ [usavian] Data sources: bird ranges, distributions, other information Jan 14, 2020

trashbirdecology modified the milestones: Gather download links for all desired data, Data sources: identify data sources for the pilot region of choice Jan 14, 2020


trashbirdecology commented Dec 6, 2019 • edited

CLICK ME FOR LIST OF DATA SOURCES HERE

Below is a list of desired data sources for bird ranges and distributions:

Data warehouses

Taxonomy
