assorted scripts related to species trait data

CALeDNA/trait-scripts

 
 

trait-scripts

assorted scripts related to species trait data

All data comes from the USDA PLANTS database, accessed through its advanced search. usda_plants_detailed_with_synonyms.csv is the original export. Many species appear in multiple rows of that file because they have several synonymous scientific names or symbols, and some rows are outright duplicates. binomial_to_symbol.csv and symbol_to_data.csv hold the same data combined into a relational format.

Rows were grouped into connected components by depth-first search, where rows A and B are connected if any of the following is true:

  • A['Accepted Symbol'] == B['Accepted Symbol']
  • A['Synonym Symbol'] == B['Accepted Symbol']
  • A['Scientific Name'] == B['Scientific Name']

Any of these conditions means that A and B represent the same species, so each connected component corresponds to one species. For each species, data from all of its rows was merged to fill in as many columns as possible, and a symbol was assigned to that species.
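
The grouping and merging described above can be sketched roughly as follows. This is a minimal illustration, not the code from this repository: the function names, the sample rows, and the "Trait" column are hypothetical, and the merge rule (first non-empty value wins) is an assumption, since how conflicting values are resolved is not stated here. Only the three column names used in the connection rules come from the description above.

```python
from collections import defaultdict


def connected_components(rows):
    """Group row indices into components using the three connection rules."""
    # Index rows so neighbours can be found without comparing every pair.
    by_accepted = defaultdict(list)
    by_name = defaultdict(list)
    for i, row in enumerate(rows):
        by_accepted[row["Accepted Symbol"]].append(i)
        by_name[row["Scientific Name"]].append(i)

    # Build an undirected adjacency map.
    adjacency = defaultdict(set)
    for i, row in enumerate(rows):
        neighbours = list(by_accepted[row["Accepted Symbol"]])
        neighbours += by_name[row["Scientific Name"]]
        synonym = row.get("Synonym Symbol")
        if synonym:  # skip empty synonym fields
            neighbours += by_accepted.get(synonym, [])
        for j in neighbours:
            if j != i:
                adjacency[i].add(j)
                adjacency[j].add(i)

    # Iterative depth-first search over the adjacency map.
    seen = set()
    components = []
    for start in range(len(rows)):
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        component = []
        while stack:
            node = stack.pop()
            component.append(node)
            for nxt in adjacency[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        components.append(sorted(component))
    return components


def merge_rows(rows, component):
    """Merge one component; ASSUMPTION: the first non-empty value wins."""
    merged = {}
    for i in component:
        for column, value in rows[i].items():
            if value not in (None, "") and column not in merged:
                merged[column] = value
    return merged


# Hypothetical rows; the symbols, names, and "Trait" column are made up.
rows = [
    {"Accepted Symbol": "ABCD", "Synonym Symbol": "",
     "Scientific Name": "Genus alpha", "Trait": "tree"},
    {"Accepted Symbol": "WXYZ", "Synonym Symbol": "ABCD",
     "Scientific Name": "Genus alphus", "Trait": ""},
    {"Accepted Symbol": "EFGH", "Synonym Symbol": "",
     "Scientific Name": "Genus beta", "Trait": "forb"},
]
components = connected_components(rows)
species = [merge_rows(rows, component) for component in components]
```

In this sketch the first two rows end up in one component because the second row's synonym symbol matches the first row's accepted symbol, and the empty "Trait" value in the second row is filled from the first.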

binomial_to_symbol.csv maps Latin binomials to species symbols. symbol_to_data.csv maps species symbols to the merged data for that species.
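
As a rough sketch of how the two files might be used together, the example below looks up a binomial, resolves it to a species symbol, and then fetches the merged data for that symbol. The column names ("binomial", "symbol", "trait"), the helper functions, and the in-memory sample data are all assumptions made for illustration; check the actual CSV headers before adapting it.

```python
import csv
import io


def load_binomial_to_symbol(handle, binomial_col="binomial", symbol_col="symbol"):
    """Read a binomial -> symbol mapping (column names are assumed)."""
    return {row[binomial_col]: row[symbol_col] for row in csv.DictReader(handle)}


def load_symbol_to_data(handle, symbol_col="symbol"):
    """Read a symbol -> merged-row mapping (column name is assumed)."""
    return {row[symbol_col]: row for row in csv.DictReader(handle)}


def lookup_traits(binomial, binomial_to_symbol, symbol_to_data):
    """Return the merged data for a binomial, or None if it is unknown."""
    symbol = binomial_to_symbol.get(binomial)
    if symbol is None:
        return None
    return symbol_to_data.get(symbol)


# In-memory stand-ins for binomial_to_symbol.csv and symbol_to_data.csv,
# with made-up contents. With the real files, pass open(path, newline="").
binomial_csv = io.StringIO(
    "binomial,symbol\nGenus alpha,ABCD\nGenus alphus,ABCD\n"
)
data_csv = io.StringIO("symbol,trait\nABCD,tree\n")

binomial_to_symbol = load_binomial_to_symbol(binomial_csv)
symbol_to_data = load_symbol_to_data(data_csv)
traits = lookup_traits("Genus alphus", binomial_to_symbol, symbol_to_data)
```

Because several binomials can map to the same symbol, synonyms resolve to a single merged record without that record being duplicated.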
