👍🎉 First off, thanks for taking the time to contribute! 🎉👍
We very much welcome new contributions through analysis notebooks or new tests for the database!
This project and everyone participating in it is governed by the Code of Conduct. By participating, you are expected to uphold this code.
This section guides you through submitting a bug report for this repository. Following these guidelines helps maintainers and the community understand your report, reproduce the behavior, and find related reports.
Before creating bug reports, please check existing issues and pull requests as you might find out that you don't need to create one. When you are creating a bug report, please include as many details as possible.
Note: If you find a Closed issue that seems like it is the same thing that you're experiencing, open a new issue and include a link to the original issue in the body of your new one.
New contributions are added to this repository through GitHub pull requests. Fork this repository to your own GitHub account and then create a new pull request with the necessary changes.
Please make sure that you
- Document new code based on the Documentation Styleguide
- End all Python files with a newline and follow PEP 8, e.g. by checking your code with flake8
The automated testing framework uses pytest to run thousands of tests, one for every single timeseries and its meta information (e.g. valid temperatures, valid ages, etc.). The rather technical setup of this test suite is outsourced to the conftest.py script, the configuration script for pytest. If you simply want to add a new test to the framework, add a function to the test_data.py script and make sure its name starts with test_. pytest will then recognize it automatically and run it for every timeseries.
pytest uses so-called fixtures to collect the tests and provide them as input arguments to the test function. For example, to add a test that checks that every LiPD file has a DOI in its publications, you can define a function such as
def test_has_publication(lipd_data):
    for pub in lipd_data['pub']:
        assert 'doi' in pub
This function will then be run for every single LiPD file that has a temperature series in it. Here, lipd_data is the Python dictionary obtained from the lipd.readLipd function. This works for any of the following fixtures that are defined in conftest.py:
- lipd_file: The path to a LiPD file
- lipd_data: The dictionary of a LiPD file as obtained from the lipd.readLipd function. This fixture is only provided for LiPD files that contain temperature series.
- series_data: One individual temperature series dictionary as obtained from the lipd.extractTs function
- pd_series: One individual series_data as a pandas series. The index is the age of the sample, the values are the corresponding temperatures.
- series_data_country: The same as series_data but with an additional geo_natEarth country that corresponds to the country inferred from the NaturalEarth shapefile.
The test suite generates an Excel report that contains some important information for every time series (e.g. minT in the test_temperature_values function). If you want to add new information here, use the record_property fixture in the test function.
Following our example above, let's not only test for the DOI in the publications, but also extract the number of publications for each LiPD file. This information should appear in the nPublications column of the Excel file. We can do so by adding a single line, record_property('nPublications', len(lipd_data['pub'])), to our test function above and adding the record_property fixture to the function arguments:
def test_has_publication(lipd_data, record_property):
    record_property('nPublications', len(lipd_data['pub']))
    for pub in lipd_data['pub']:
        assert 'doi' in pub
Furthermore, to provide a bit more documentation in the Excel file, we recommend that you
- Add a docstring that describes what the test is doing
- Use the record_property_name fixture to document what nPublications means
The final test function then looks like
def test_has_publication(
        lipd_data, record_property, record_property_name):
    """Test the number of publications and their DOI"""
    record_property_name(nPublications="Number of Publications in the LiPD file")
    record_property('nPublications', len(lipd_data['pub']))
    for pub in lipd_data['pub']:
        assert 'doi' in pub
We want to make the Temperature12K database more accessible, so we very much welcome contributions that play around with the data!
New notebooks should be created in the notebooks directory and should be self-explanatory, i.e. make sure you add enough comments to let the user follow what the notebook is doing. If you need further packages to run the analysis, you should also add them to the environment.yml file. This makes sure that everyone can run your awesome analysis for free on mybinder.org.
More information about Jupyter notebooks can be found at https://jupyter.org.
- Use the present tense ("Add feature" not "Added feature")
- Use the imperative mood ("Move cursor to..." not "Moves cursor to...")
- Limit the first line (summary) to 72 characters or less
- Reference issues and pull requests liberally after the first line
- Follow the numpy documentation guidelines.
- Use reStructuredText in the function documentation.
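As a hedged illustration of the two documentation points above, the sketch below documents a small helper with a numpy-style docstring whose inline markup is reStructuredText. The function count_publications is hypothetical and not part of this repository; only the 'pub' key is taken from the examples in this guide.

```python
def count_publications(lipd_data):
    """Count the publications listed in a LiPD dictionary.

    Parameters
    ----------
    lipd_data : dict
        Dictionary of a LiPD file. Only the optional ``'pub'`` key,
        a list of publication dictionaries, is used.

    Returns
    -------
    int
        Number of entries under ``'pub'``, or ``0`` if the key is
        missing.

    Examples
    --------
    >>> count_publications({'pub': [{'doi': '10.0000/example'}]})
    1
    """
    # Fall back to an empty list if the file lists no publications.
    return len(lipd_data.get('pub', []))
```

The section headers (Parameters, Returns, Examples) and their underlines follow the numpy layout, while the double backticks are reStructuredText inline literals.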