Version bump to v1.0.1 (#103)
* Fix typo in places.py. (#71) (#84)

* Added iPynb notebooks and fixed issue #67 (#86)

* Added notebooks

* Added --upgrade flag

* Update requirements and setup.py

* Added maintenance information to the readme

* Update setup.py and pypi long description (#87)

* Update setup.py and requirements

* Add long description for pypi package

* Use README.md as the long description

* Use single quote

* Fixed docs and updated changelog (#89)

* Amended documentation for creating a key.

* Updated CHANGELOG

* Spacing

* Added 0.4.3 changes

* Added 0.x to changelog

* Add get_pop_obs api (#93)

* Add get_pop_obs api

* Updated comments with more detailed description.

* Update comment and rst file

* Updated comments

* Move to populations module

* Add missing change

* Correct typo

* More typo correction

* Add comments on observations field

* Add the step to join google group (#95)

* Add more details for joining google group (#96)

* Implemented get_place_obs and reimplemented query (#94)

* Implemented get_place_obs

* Reimplemented query as a function instead of a class.

* query now returns a list. Amended docstrings.

* Fixed docstring typo

* Add observation_date as argument to get_place_obs (#97)

* Add observation_date as argument to get_place_obs

* Update doc string

* Update test

* fix link to getting started on read the docs home page (#98)

* Fixed bug regarding indexing (#100)

* Updated the changelog (#101)
antaresc committed Oct 2, 2019
1 parent 836a8f4 commit 3349726
Showing 29 changed files with 8,372 additions and 187 deletions.
53 changes: 53 additions & 0 deletions CHANGELOG.md
@@ -1 +1,54 @@
# Changelog

## 1.0.1

**Date** - 10/2/2019

**Release Tag** - [v1.0.1](https://github.com/datacommonsorg/api-python/releases/tag/v1.0.1)

**Release Status** - Current head of branch [`stable-1.x`](https://github.com/datacommonsorg/api-python/tree/stable-1.x)

New features added to the Python Client API

- Added two new functions `get_pop_obs` and `get_place_obs`
- SPARQL query is now supported as a function `query` instead of a class.
- Added documentation on how to provision an API key and provide it to the API

Bugs fixed in new release

- Fixed various typos and formatting issues in the documentation.
- If the index of the `pandas.Series` passed into functions such as `get_populations` and `get_observations` was not contiguous, then the assignment step would not properly align the values returned by calling the function. This is because the `pandas.Series` returned by the function would have a different index than the given series. This is fixed by assigning the index of the returned series to that of the given series.
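The index-alignment bug described above can be reproduced with plain pandas. The following is a minimal sketch, not the library's code: `results`, `lookup_before_fix`, and `lookup_after_fix` are hypothetical stand-ins for an API call that returns one value per dcid.

```python
import pandas as pd

# A frame whose index is non-contiguous, e.g. after filtering rows.
frame = pd.DataFrame(
    {"place": ["geoId/06", "geoId/21", "geoId/24"]},
    index=[0, 5, 9],
)

# Hypothetical per-dcid results standing in for an API response.
results = {
    "geoId/06": "California",
    "geoId/21": "Kentucky",
    "geoId/24": "Maryland",
}


def lookup_before_fix(dcids):
    # Old behavior: the returned Series gets a fresh 0..n-1 index.
    return pd.Series([results[dcid] for dcid in dcids])


def lookup_after_fix(dcids):
    # New behavior: reuse the caller's index so assignment aligns row by row.
    return pd.Series([results[dcid] for dcid in dcids], index=dcids.index)


# Column assignment aligns on index labels, so only label 0 matches
# in the "before" case and the other rows are left as missing values.
frame["name_before"] = lookup_before_fix(frame["place"])
frame["name_after"] = lookup_after_fix(frame["place"])
print(frame)
```

Because pandas aligns on index labels during column assignment, the "before" column is filled only where the fresh labels happen to coincide with the frame's labels, while the "after" column is filled for every row.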

## 1.0.0

**Date** - 8/9/2019

**Release Tag** - [v1.0.0](https://github.com/datacommonsorg/api-python/releases/tag/v1.0.0)

New release of the Python Client API.

- New functions in the API built on top of the [Data Commons REST API](https://github.com/datacommonsorg/mixer).
- `get_property_labels`
- `get_property_values`
- `get_triples`
- `get_populations`
- `get_observations`
- `get_places_in`
- New tests and examples checked into `datacommons/test` and `datacommons/examples`
- Full documentation released on [readthedocs](https://datacommons.readthedocs.io/en/latest/)

## 0.4.3

**Date** - 8/13/2019

**Release Tag** - [v0.4.3](https://github.com/datacommonsorg/api-python/releases/tag/v0.4.3)

**Release Status** - Latest on [PyPI](https://pypi.org/project/datacommons/). Current head of branch [`stable-0.x`](https://github.com/datacommonsorg/api-python/tree/stable-0.x).

Patch release that fixes bugs in `datacommons.Client`.

- Functions `get_cities` and `get_states` now provide `typeOf` constraints in their datalog queries.

## 0.x

Initial release of the Data Commons API.
1 change: 0 additions & 1 deletion Dockerfile
@@ -16,7 +16,6 @@ RUN apt-get -q update && \

# Install python
RUN python setup.py -q install
RUN pip3 install --upgrade requests

# Run the tests
RUN ./build.sh
4 changes: 2 additions & 2 deletions datacommons/__init__.py
@@ -13,12 +13,12 @@
# limitations under the License.

# Data Commons SPARQL query support
from datacommons.query import Query
from datacommons.query import query

# Data Commons Python Client API
from datacommons.core import get_property_labels, get_property_values, get_triples
from datacommons.places import get_places_in
from datacommons.populations import get_populations, get_observations
from datacommons.populations import get_populations, get_observations, get_pop_obs, get_place_obs

# Other utilities
from .utils import set_api_key, clean_frame, flatten_frame
2 changes: 1 addition & 1 deletion datacommons/core.py
@@ -207,7 +207,7 @@ def get_property_values(dcids,

# Format the results as a Series if a Pandas Series is provided.
if isinstance(dcids, pd.Series):
return pd.Series([results[dcid] for dcid in dcids])
return pd.Series([results[dcid] for dcid in dcids], index=dcids.index)
return results


6 changes: 6 additions & 0 deletions datacommons/examples/populations.py
@@ -22,6 +22,7 @@

import datacommons as dc
import pandas as pd
import pprint

import datacommons.utils as utils

@@ -84,5 +85,10 @@ def main():
print(pd_frame)


# Get all population and observation data of Mountain View.
utils._print_header('Get Mountain View population and observation')
popobs = dc.get_pop_obs("geoId/0649670")
pprint.pprint(popobs)

if __name__ == '__main__':
main()
7 changes: 4 additions & 3 deletions datacommons/places.py
@@ -29,7 +29,8 @@


def get_places_in(dcids, place_type):
""" Returns :obj:`Place`'s contained in :code:`dcids` of type `place_type`.
""" Returns :obj:`Place`s contained in :code:`dcids` of type
:code:`place_type`.
Args:
dcids (Union[:obj:`list` of :obj:`str`, :obj:`pandas.Series`]): Dcids to get
@@ -55,7 +56,7 @@ def get_places_in(dcids, place_type):
Examples:
We would like to get all Counties contained in
`California <https://browser.datacommons.org/kg?dcid=geoId/06>`_. Specifying
the :code:`dcids` as a :obj:`list` resulst in the following.
the :code:`dcids` as a :obj:`list` results in the following.
>>> get_places_in(["geoId/06"], "County")
{
@@ -90,5 +91,5 @@ def get_places_in(dcids, place_type):
# Create the results and format it appropriately
result = utils._format_expand_payload(payload, 'place', must_exist=dcids)
if isinstance(dcids, pd.Series):
return pd.Series([result[dcid] for dcid in dcids])
return pd.Series([result[dcid] for dcid in dcids], index=dcids.index)
return result
212 changes: 210 additions & 2 deletions datacommons/populations.py
@@ -115,7 +115,7 @@ def get_populations(dcids, population_type, constraining_properties={}):
payload, 'population', must_exist=dcids)
if isinstance(dcids, pd.Series):
flattened = utils._flatten_results(result, default_value="")
return pd.Series([flattened[dcid] for dcid in dcids])
return pd.Series([flattened[dcid] for dcid in dcids], index=dcids.index)

# Drop empty results while flattening
return utils._flatten_results(result)
@@ -223,7 +223,7 @@ def get_observations(dcids,
payload, 'observation', must_exist=dcids)
if isinstance(dcids, pd.Series):
flattened = utils._flatten_results(result, default_value="")
series = pd.Series([flattened[dcid] for dcid in dcids])
series = pd.Series([flattened[dcid] for dcid in dcids], index=dcids.index)
return series.apply(pd.to_numeric, errors='coerce')

# Drop empty results by calling _flatten_results without default_value, then
@@ -235,3 +235,211 @@ def get_observations(dcids,
except ValueError:
typed_results[k] = v
return typed_results


def get_pop_obs(dcid):
""" Returns all :obj:`StatisticalPopulation` and :obj:`Observation` \
of a :obj:`Thing`.
Args:
dcid (:obj:`str`): Dcid of the thing.
Returns:
A :obj:`dict` of :obj:`StatisticalPopulation` and :obj:`Observation` that
are associated to the thing identified by the given :code:`dcid`. The given
dcid is linked to the returned :obj:`StatisticalPopulation`,
which are the :obj:`observedNode` of the returned :obj:`Observation`.
See example below for more detail about how the returned :obj:`dict` is
structured.
Raises:
ValueError: If the payload returned by the Data Commons REST API is
malformed.
Examples:
We would like to get all :obj:`StatisticalPopulation` and
:obj:`Observations` of
`Santa Clara <https://browser.datacommons.org/kg?dcid=geoId/06085>`_.
>>> get_pop_obs("geoId/06085")
{
'name': 'Santa Clara',
'placeType': 'County',
'populations': {
'dc/p/zzlmxxtp1el87': {
'popType': 'Household',
'numConstraints': 3,
'propertyValues': {
'householderAge': 'Years45To64',
'householderRace': 'USC_AsianAlone',
'income': 'USDollar35000To39999'
},
'observations': [
{
'marginOfError': 274,
'measuredProp': 'count',
'measuredValue': 1352,
'measurementMethod': 'CensusACS5yrSurvey',
'observationDate': '2017'
},
{
'marginOfError': 226,
'measuredProp': 'count',
'measuredValue': 1388,
'measurementMethod': 'CensusACS5yrSurvey',
'observationDate': '2013'
}
],
},
},
'observations': [
{
'meanValue': 4.1583,
'measuredProp': 'particulateMatter25',
'measurementMethod': 'CDCHealthTracking',
'observationDate': '2014-04-04',
'observedNode': 'geoId/06085'
},
{
'meanValue': 9.4461,
'measuredProp': 'particulateMatter25',
'measurementMethod': 'CDCHealthTracking',
'observationDate': '2014-03-20',
'observedNode': 'geoId/06085'
}
]
}
Notice that the return value is a multi-level :obj:`dict`. The top level
contains the following keys.
- :code:`name` and :code:`placeType` provide the name and type of the
:obj:`Place` identified by the given :code:`dcid`.
- :code:`populations` maps to a :obj:`dict` containing all
:obj:`StatisticalPopulation` that have the given :code:`dcid` as their
:obj:`location`.
- :code:`observations` maps to a :obj:`list` containing all
:obj:`Observation` that have the given :code:`dcid` as their
:obj:`observedNode`.
The :code:`populations` dictionary is keyed by the dcid of each
:obj:`StatisticalPopulation`. The mapped dictionary contains the following
keys.
- :code:`popType` which gives the population type of the
:obj:`StatisticalPopulation` identified by the key.
- :code:`numConstraints` which gives the number of constraining properties
defined for the identified :obj:`StatisticalPopulation`.
- :code:`propertyValues` which gives a :obj:`dict` mapping a constraining
property to its value for the identified :obj:`StatisticalPopulation`.
- :code:`observations` which gives a list of all :obj:`Observation`'s that
have the identified :obj:`StatisticalPopulation` as their
:obj:`observedNode`.
Each :obj:`Observation` is represented by a :code:`dict` that has the keys:
- :code:`measuredProp`: The property measured by the :obj:`Observation`.
- :code:`observationDate`: The date when the :obj:`Observation` was made.
- :code:`observationPeriod` (optional): The period over which the
:obj:`Observation` was made.
- :code:`measurementMethod` (optional): A field providing additional
information on how the :obj:`Observation` was collected.
- Additional fields that denote values measured by the :obj:`Observation`.
These may include the following: :code:`measuredValue`, :code:`meanValue`,
:code:`medianValue`, :code:`maxValue`, :code:`minValue`, :code:`sumValue`,
:code:`marginOfError`, :code:`stdError`, :code:`meanStdError`, and others.
"""
url = utils._API_ROOT + utils._API_ENDPOINTS['get_pop_obs'] + '?dcid={}'.format(dcid)
return utils._send_request(url, compress=True, post=False)
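The nested structure described in the `get_pop_obs` docstring can be walked with ordinary dictionary access. The sketch below is illustrative only: `pop_obs` is a trimmed copy of the docstring's sample response rather than a live API result, and `latest_count` is a hypothetical helper. It assumes `observationDate` values share one ISO-8601 format, so that string comparison orders them chronologically.

```python
# Trimmed copy of the sample response shown in the get_pop_obs docstring.
pop_obs = {
    'name': 'Santa Clara',
    'placeType': 'County',
    'populations': {
        'dc/p/zzlmxxtp1el87': {
            'popType': 'Household',
            'numConstraints': 3,
            'propertyValues': {
                'householderAge': 'Years45To64',
                'householderRace': 'USC_AsianAlone',
                'income': 'USDollar35000To39999',
            },
            'observations': [
                {'marginOfError': 274, 'measuredProp': 'count',
                 'measuredValue': 1352, 'observationDate': '2017'},
                {'marginOfError': 226, 'measuredProp': 'count',
                 'measuredValue': 1388, 'observationDate': '2013'},
            ],
        },
    },
    'observations': [],
}


def latest_count(population):
    """Return the most recent 'count' measuredValue, or None if absent."""
    counts = [
        obs for obs in population.get('observations', [])
        if obs.get('measuredProp') == 'count' and 'measuredValue' in obs
    ]
    if not counts:
        return None
    # Assumes same-format ISO-8601 dates, which sort correctly as strings.
    newest = max(counts, key=lambda obs: obs['observationDate'])
    return newest['measuredValue']


# Map each StatisticalPopulation dcid to its latest count.
latest = {
    dcid: latest_count(population)
    for dcid, population in pop_obs['populations'].items()
}
print(latest)
```

Using `.get` with defaults keeps the helper tolerant of populations that carry no observations, which the docstring does not rule out.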

def get_place_obs(place_type, observation_date, population_type, constraining_properties={}):
""" Returns all :obj:`Observation`'s for all places given the place type,
observation date and the :obj:`StatisticalPopulation` constraints.
Args:
place_type (:obj:`str`): The type of places to query
:obj:`StatisticalPopulation`'s and :obj:`Observation`'s for.
observation_date (:obj:`str`): The observation date in ISO-8601 format.
population_type (:obj:`str`): The population type of the
:obj:`StatisticalPopulation`
constraining_properties (:obj:`map` from :obj:`str` to :obj:`str`, optional):
A map from constraining property to the value that the
:obj:`StatisticalPopulation` should be constrained by.
Returns:
A list of dictionaries, with each dictionary containing *all*
:obj:`Observation`'s of a place that conform to the :obj:`StatisticalPopulation`
constraints. See examples for more details on how the format of the
return value is structured.
Raises:
ValueError: If the payload is malformed.
Examples:
We would like to get all :obj:`StatisticalPopulation` and
:obj:`Observations` for all places of type :obj:`City` in year 2017 where
the populations have a population type of :obj:`Person` and are specified by
the following constraining properties.
- Persons should have `age <https://browser.datacommons.org/kg?dcid=age>`_
with value `Years5To17 <https://browser.datacommons.org/kg?dcid=Years5To17>`_
- Persons should have `placeOfBirth <https://browser.datacommons.org/kg?dcid=placeOfBirth>`_
with value BornInOtherStateInTheUnitedStates.
>>> props = {
... 'age': 'Years5To17',
... 'placeOfBirth': 'BornInOtherStateInTheUnitedStates'
... }
>>> get_place_obs('City', '2017', 'Person', constraining_properties=props)
[
{
'name': 'Marcus Hook borough',
'place': 'geoId/4247344',
'populations': {
'dc/p/pq6frs32sfvk': {
'observations': [
{
'marginOfError': 39,
'measuredProp': 'count',
'measuredValue': 67,
'type': 'Observation'
},
# More observations...
],
}
}
},
# Entries for more cities...
]
The value returned by :code:`get_place_obs` is a :obj:`list` of
:obj:`dict`'s. Each dictionary corresponds to a :obj:`StatisticalPopulation`
matching the given :code:`population_type` and
:code:`constraining_properties` for a single place of the given
:code:`place_type`. The dictionary contains the following keys.
- :code:`name`: The name of the place being described.
- :code:`place`: The dcid associated with the place being described.
- :code:`populations`: A :obj:`dict` mapping :code:`StatisticalPopulation`
dcids to a :obj:`dict` with a list of :code:`observations`.
Each :obj:`Observation` is represented by a :obj:`dict` with the following
keys.
- :code:`measuredProp`: The property measured by the :obj:`Observation`.
- :code:`measurementMethod` (optional): A field identifying how the
:obj:`Observation` was made
- Additional fields that denote values measured by the :obj:`Observation`.
These may include the following: :code:`measuredValue`, :code:`meanValue`,
:code:`medianValue`, :code:`maxValue`, :code:`minValue`, :code:`sumValue`,
:code:`marginOfError`, :code:`stdError`, :code:`meanStdError`, and others.
"""
# Create the json payload and send it to the REST API.
pv = [{'property': k, 'value': v} for k, v in constraining_properties.items()]
url = utils._API_ROOT + utils._API_ENDPOINTS['get_place_obs']
payload = utils._send_request(url, req_json={
'place_type': place_type,
'observation_date': observation_date,
'population_type': population_type,
'pvs': pv,
}, compress=True)
return payload['places']
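As a rough illustration of `get_place_obs`, the sketch below shows how the constraining properties become the `pvs` list sent in the request, and how a returned list of places might be flattened into one row per observation. `build_pvs` and `flatten_place_obs` are hypothetical helpers, and `places` is copied from the docstring's sample output rather than fetched from the API.

```python
def build_pvs(constraining_properties):
    """Mirror the list comprehension used to build the 'pvs' request field."""
    return [
        {'property': prop, 'value': value}
        for prop, value in constraining_properties.items()
    ]


def flatten_place_obs(places):
    """Flatten the nested result into one flat dict per observation."""
    rows = []
    for place in places:
        for pop_dcid, population in place.get('populations', {}).items():
            for obs in population.get('observations', []):
                row = {
                    'place': place['place'],
                    'name': place['name'],
                    'population': pop_dcid,
                }
                row.update(obs)  # measuredProp, measuredValue, etc.
                rows.append(row)
    return rows


props = {
    'age': 'Years5To17',
    'placeOfBirth': 'BornInOtherStateInTheUnitedStates',
}
pvs = build_pvs(props)

# Sample entry copied from the get_place_obs docstring, not a live response.
places = [
    {
        'name': 'Marcus Hook borough',
        'place': 'geoId/4247344',
        'populations': {
            'dc/p/pq6frs32sfvk': {
                'observations': [
                    {'marginOfError': 39, 'measuredProp': 'count',
                     'measuredValue': 67, 'type': 'Observation'},
                ],
            },
        },
    },
]

rows = flatten_place_obs(places)
print(pvs)
print(rows)
```

A flat list of dicts like `rows` can be passed directly to `pandas.DataFrame` if tabular output is wanted.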
