Merge pull request #16 from CDJellen/develop
Backwards compatibility for 3.6 and 3.7
CDJellen committed Sep 16, 2022
2 parents f0e6d94 + 11263e0 commit fc519f3
Showing 7 changed files with 185 additions and 14 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/ci.yml
@@ -30,7 +30,7 @@ jobs:
python -m pip install --upgrade pip
pip install -r requirements.txt
pip install -r requirements_dev.txt
python -m pytest
Coverage:
timeout-minutes: 30
runs-on: ubuntu-latest
2 changes: 2 additions & 0 deletions .gitignore
@@ -23,3 +23,5 @@ Brewfile.lock.json
*.pickle
*.ipynb
dist/
build/
.ipynb_checkpoints/
179 changes: 174 additions & 5 deletions README.md
@@ -17,11 +17,11 @@
</div>


The National Oceanic and Atmospheric Ascociation's National Data Buoy Center maintains marine monitoring and observation stations around the world[^1].
The National Oceanic and Atmospheric Administration's National Data Buoy Center maintains marine monitoring and observation stations around the world[^1]. These stations report atmospheric, oceanographic, and other meteorological data at regular intervals to the NDBC. Measurements are made available over HTTP through the NDBC's data service.

The ndbc-api is a python library that makes this data widely accessible.
The ndbc-api is a Python library that makes this data more widely accessible.

ndbc-api is primarily built to parse whitespace-delimited oceanographic and atmospheric data distributed as text files for available time ranges, on a station-by-station basis[^2]. More information on the measreuemnts and methodology are available [on the NDBC website](https://www.ndbc.noaa.gov/docs/ndbc_web_data_guide.pdf)[^3].
The ndbc-api is primarily built to parse whitespace-delimited oceanographic and atmospheric data distributed as text files for available time ranges, on a station-by-station basis[^2]. Measurements are typically distributed as `utf-8` encoded, fixed-period text files, one set per station. More information on the measurements and methodology is available [on the NDBC website](https://www.ndbc.noaa.gov/docs/ndbc_web_data_guide.pdf)[^3].
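
As a rough illustration of what parsing a whitespace-delimited measurement file involves, the sketch below reads a small sample using only the standard library. The sample values are made up, and the two `#`-prefixed header lines and the `MM` missing-value marker are assumptions about the file layout; `parse_measurements` is a hypothetical helper, not part of the `ndbc-api`.

```python
from typing import Dict, List, Optional

# Illustrative sample only: the values are invented, and the layout
# (two '#' header lines, 'MM' for missing values) is an assumption.
SAMPLE = """\
#YY  MM DD hh mm WDIR WSPD
#yr  mo dy hr mn degT m/s
2022 09 15 00 00 180  5.1
2022 09 15 00 10 185  MM
"""


def parse_measurements(text: str,
                       missing: str = 'MM') -> List[Dict[str, Optional[float]]]:
    """Parse whitespace-delimited rows into dictionaries keyed by column name."""
    header: List[str] = []
    rows: List[Dict[str, Optional[float]]] = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        if line.startswith('#'):
            # The first '#' line names the columns; later ones (units) are skipped.
            if not header:
                header = line.lstrip('#').split()
            continue
        values = line.split()
        rows.append({
            name: (None if value == missing else float(value))
            for name, value in zip(header, values)
        })
    return rows


rows = parse_measurements(SAMPLE)
```

A list of dictionaries like `rows` can be passed directly to `pandas.DataFrame` if tabular output is preferred.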

Please read the documentation for more information:
https://ndbc-api.readthedocs.io
@@ -32,11 +32,177 @@ https://ndbc-api.readthedocs.io


#### Installation
ndbc-api can be installed via PIP:
The `ndbc-api` can be installed via `pip`:

```sh
pip install ndbc-api
```

#### Requirements
The `ndbc-api` has been tested on Python 3.6, 3.7, 3.8, 3.9, and 3.10. Python 2 support is not currently planned, but could be implemented based on the needs of the atmospheric research community.

The API uses synchronous HTTP requests to compile data matching the user-supplied parameters. The `ndbc-api` package depends on:
* requests>=2.10.0
* pandas
* bs4
* html5lib>=1.1

##### Development
If you would like to contribute to the growth and maintenance of the `ndbc-api`, please feel free to open a PR with tests covering your changes. The tests leverage `pytest` and depend on the above requirements, as well as:
* coveralls
* httpretty
* pytest
* pytest-cov
* pyyaml
* pyarrow

Breaking changes will be considered, especially in the current `alpha` state of the package on `PyPI`. As the API further matures, breaking changes will only be considered with new major versions (e.g. `N.0.0`).

#### Example

The `ndbc-api` exposes public methods through the `NdbcApi` class.

```python3
from ndbc_api import NdbcApi

api = NdbcApi()
```

The `api` is a singleton, such that the underlying `RequestHandler` and NDBC station-level `RequestCache`s are shared between instances. Both the singleton metaclass and `RequestHandler` are implemented to reduce the likelihood of repeat requests to the NDBC's data service, and to conserve NDBC resources. This is balanced by a station-level `cache_limit`, implemented as an LRU cache, which seeks to respect user resources.
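
The pattern described above can be sketched in a few lines. This is a minimal illustration of a singleton metaclass paired with a bounded LRU cache, not the package's actual implementation; `SingletonMeta`, `StationCache`, and `ExampleApi` are hypothetical names chosen for the example.

```python
from collections import OrderedDict


class SingletonMeta(type):
    """Metaclass that returns the same instance on every instantiation."""
    _instances = {}

    def __call__(cls, *args, **kwargs):
        if cls not in cls._instances:
            cls._instances[cls] = super().__call__(*args, **kwargs)
        return cls._instances[cls]


class StationCache:
    """Least-recently-used cache holding at most `cache_limit` entries."""

    def __init__(self, cache_limit=2):
        self.cache_limit = cache_limit
        self._store = OrderedDict()

    def get(self, key):
        if key not in self._store:
            return None
        self._store.move_to_end(key)  # mark as most recently used
        return self._store[key]

    def put(self, key, value):
        self._store[key] = value
        self._store.move_to_end(key)
        if len(self._store) > self.cache_limit:
            self._store.popitem(last=False)  # evict least recently used


class ExampleApi(metaclass=SingletonMeta):
    """Hypothetical client; every instance shares one cache."""

    def __init__(self, cache_limit=2):
        self.cache = StationCache(cache_limit)


first = ExampleApi()
second = ExampleApi(cache_limit=5)  # arguments are ignored; `first` is returned
```

Because `first` and `second` are the same object, a response cached through one is visible through the other, which is what keeps repeat requests from reaching the remote service.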

Data made available by the NDBC falls into two broad categories.

1. Station metadata
2. Station measurements

The `api` supports a range of public methods for accessing data from the above categories.

##### Station metadata

The `api` has five key public methods for accessing NDBC metadata.

1. The `stations` method, which returns all NDBC stations.
2. The `nearest_station` method, which returns the station ID of the nearest station.
3. The `station` method, which returns station metadata from a given station ID.
4. The `available_realtime` method, which returns hyperlinks and measurement names for realtime measurements captured by a given station.
5. The `available_historical` method, which returns hyperlinks and measurement names for historical measurements captured by a given station.

###### `stations`

```python3
# get all stations and some metadata as a Pandas DataFrame
stations_df = api.stations()
# parse the response as a dictionary
stations_dict = api.stations(as_df=False)
```

###### `nearest_station`

```python3
# specify desired latitude and longitude
lat = '38.88N'
lon = '76.43W'

# find the station ID of the nearest NDBC station
nearest = api.nearest_station(lat=lat, lon=lon)
print(nearest)
```

```python3
'tplm2'
```

###### `station`

```python3
# get station metadata
tplm2_meta = api.station(station_id='tplm2')
# parse the response as a Pandas DataFrame
tplm2_df = api.station(station_id='tplm2', as_df=True)
```

###### `available_realtime`

```python3
# get all available realtime measurements, periods, and hyperlinks
tplm2_realtime = api.available_realtime(station_id='tplm2')
# parse the response as a Pandas DataFrame
tplm2_realtime_df = api.available_realtime(station_id='tplm2', as_df=True)
```

###### `available_historical`

```python3
# get all available historical measurements, periods, and hyperlinks
tplm2_historical = api.available_historical(station_id='tplm2')
# parse the response as a Pandas DataFrame
tplm2_historical_df = api.available_historical(station_id='tplm2', as_df=True)
```

##### Station measurements

The `api` has two public methods for accessing supported NDBC station measurements.

1. The `get_modes` method, which returns a list of supported `mode`s, corresponding to the data formats provided by the NDBC data service.
2. The `get_data` method, which returns measurements of a given type for a given station.

Note that not all stations provide the same set of measurements. The `available_realtime` and `available_historical` methods can be called on a station-by-station basis to ensure a station has the desired data available before building and executing requests with `get_data`.

###### `get_modes`

```python3
# get the list of supported meteorological measurement modes
modes = api.get_modes()
print(modes)
```

```python3
[
'adcp',
'cwind',
'ocean',
'spec',
'stdmet',
'supl',
'swden',
'swdir',
'swdir2',
'swr1',
'swr2'
]
```

###### `get_data`

```python3
# get all continuous wind measurements for station tplm2
cwind_df = api.get_data(
station_id='tplm2',
mode='cwind',
start_time='2020-01-01',
end_time='2022-09-15',
)
# return data as a dictionary
cwind_dict = api.get_data(
station_id='tplm2',
mode='cwind',
start_time='2020-01-01',
end_time='2022-09-15',
as_df=False
)
# get only the wind speed measurements
wspd_df = api.get_data(
station_id='tplm2',
mode='cwind',
start_time='2020-01-01',
end_time='2022-09-15',
as_df=True,
cols=['WSPD']
)
```

#### More Information
See the [documentation](https://ndbc-api.readthedocs.io/en/latest/) for more info.


@@ -46,5 +212,8 @@ the [GitHub discussion forum](https://github.com/cdjellen/ndbc-api/discussions).


#### Contributing
TODO
The `ndbc-api` is actively maintained. Please feel free to open a pull request if you have any suggested improvements; test coverage is strongly preferred.

As a reminder, breaking changes will be considered, especially in the current `alpha` state of the package on `PyPI`. As the API further matures, breaking changes will only be considered with new major versions (e.g. `N.0.0`).

Alternatively, if you have an idea for a new capability or improvement, feel free to open a feature request issue outlining your suggestion and the ways in which it will empower the atmospheric research community.
2 changes: 1 addition & 1 deletion ndbc_api/ndbc_api.py
@@ -450,7 +450,7 @@ def _handle_timestamp(timestamp: Union[datetime, str]) -> datetime:
return timestamp
else:
try:
return datetime.fromisoformat(str(timestamp))
return datetime.strptime(timestamp, '%Y-%m-%d')
except ValueError as e:
raise TimestampException from e
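
Note on this change: `datetime.fromisoformat` was added in Python 3.7, so it is unavailable on 3.6, while `datetime.strptime` works on every version listed in the README. The trade-off is that the `'%Y-%m-%d'` format accepts date-only strings, whereas `fromisoformat` also accepts strings with a time component. The sketch below shows that behavior in isolation; `parse_day` is a hypothetical stand-in, not the function in `ndbc_api.py`.

```python
from datetime import datetime
from typing import Union


def parse_day(timestamp: Union[datetime, str]) -> datetime:
    """Return datetimes unchanged; parse 'YYYY-MM-DD' strings on any Python 3."""
    if isinstance(timestamp, datetime):
        return timestamp
    # Raises ValueError for anything that is not exactly a date, e.g. a
    # string that also carries a time component.
    return datetime.strptime(timestamp, '%Y-%m-%d')


parsed = parse_day('2020-01-01')
```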

4 changes: 2 additions & 2 deletions tests/api/handlers/_base.py
@@ -8,8 +8,8 @@
PARSED_TESTS_DIR = TESTS_DATA_DIR.joinpath('api', 'parsed')
RESPONSES_TESTS_DIR = TESTS_DATA_DIR.joinpath('api', 'responses')
REQUESTS_TESTS_DIR = TESTS_DATA_DIR.joinpath('api', 'requests')
TEST_START = datetime.fromisoformat('2020-01-01')
TEST_END = datetime.fromisoformat('2022-07-15')
TEST_START = datetime.strptime('2020-01-01', '%Y-%m-%d')
TEST_END = datetime.strptime('2022-07-15', '%Y-%m-%d')


def mock_register_uri(read_requests: List[str],
4 changes: 2 additions & 2 deletions tests/api/requests/_base.py
@@ -4,7 +4,7 @@

REALTIME_START = datetime.today() - timedelta(days=30)
REALTIME_END = datetime.today()
HISTORICAL_START = datetime.fromisoformat('2020-01-01')
HISTORICAL_END = datetime.fromisoformat('2021-07-15')
HISTORICAL_START = datetime.strptime('2020-01-01', '%Y-%m-%d')
HISTORICAL_END = datetime.strptime('2021-07-15', '%Y-%m-%d')
REQUESTS_TESTS_DIR = TESTS_DATA_DIR.joinpath('api', 'requests')
BASE_URL = 'https://www.ndbc.noaa.gov/'
6 changes: 3 additions & 3 deletions tests/test_ndbc_api.py
@@ -86,7 +86,7 @@ def test_dump_cache_empty(ndbc_api):
data = ndbc_api.dump_cache(dest_fp=test_fp)
assert data is None
assert path.exists(str(test_fp))
test_fp.unlink(missing_ok=False)
test_fp.unlink()


@pytest.mark.usefixtures('mock_socket', 'read_responses', 'read_parsed_df')
@@ -114,7 +114,7 @@ def test_dump_cache_nonempty(ndbc_api):
data = ndbc_api.dump_cache(dest_fp=test_fp)
assert data is None
assert path.exists(str(test_fp))
test_fp.unlink(missing_ok=False)
test_fp.unlink()


def test_get_headers(ndbc_api):
@@ -302,7 +302,7 @@ def test_station_historical(ndbc_api, mock_socket, read_responses,
@pytest.mark.private
def test_handle_timestamp(ndbc_api):
test_convert_timestamp = '2020-01-01'
test_converted_timestamp = datetime.fromisoformat('2020-01-01')
test_converted_timestamp = datetime.strptime('2020-01-01', '%Y-%m-%d')
want = test_converted_timestamp
got = ndbc_api._handle_timestamp(test_convert_timestamp)
assert got == want
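
Note on the `test_fp.unlink()` changes in this file: the `missing_ok` parameter of `Path.unlink` was added in Python 3.8, so passing it fails on 3.6 and 3.7. Calling `unlink()` with no arguments still raises `FileNotFoundError` for a missing file, which matches the earlier `missing_ok=False` behavior. If the tolerant behavior were ever needed on older interpreters, a small wrapper along these lines would be one option; `remove_file` is a hypothetical helper, not part of the test suite.

```python
import tempfile
from pathlib import Path


def remove_file(path: Path, missing_ok: bool = False) -> None:
    """Delete `path`, optionally ignoring a missing file (works on Python 3.6+)."""
    try:
        path.unlink()
    except FileNotFoundError:
        if not missing_ok:
            raise


# Create a throwaway file, then remove it.
with tempfile.TemporaryDirectory() as tmp_dir:
    target = Path(tmp_dir) / 'cache.pickle'
    target.write_bytes(b'')
    remove_file(target)
    removed = not target.exists()
    # A second removal succeeds only because missing_ok=True.
    remove_file(target, missing_ok=True)
```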
