Isochrones & Isodistances #957

Merged
merged 54 commits into from
Oct 2, 2019

Contributor

@jgoizueta jgoizueta commented Sep 5, 2019

Fixes #889

Interface:

There are two methods, isochrones and isodistances; both take a dataset argument, an array of range values (seconds for isochrones, meters for isodistances), and a number of optional arguments.

from cartoframes.analysis import IsoAnalysis  # will be from data.services ...
iso = IsoAnalysis()

# dry run 
iso.isochrones(dataset, [100,1000], mode='car', dry_run=True) # => {'required_credits': N}

# compute isochrones and return as a dataframe dataset
iso.isochrones(dataset, [100,1000], mode='car') # => new_ds

# compute isochrones and return as a table dataset
iso.isochrones(dataset, [100,1000], mode='car', table_name='isolines') # => new_ds

@jgoizueta jgoizueta changed the base branch from develop to feature/888-geocode September 5, 2019 16:30
@jgoizueta
Contributor Author

Problems so far:

  • To avoid losing partial results we should use the _exception_safe functions. But then we don't know whether a problem occurred (only that we got fewer rows than expected).
  • If no result table is desired this returns the results directly as a dataframe dataset; but to do so we cannot use batch queries, so we're subject to the usual timeout limits.
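One way to notice that an exception-safe call silently dropped rows is to compare the (point, range) pairs we expected with the ones that came back. This is only a minimal sketch: the helper name and the `source_id` / `data_range` keys are assumptions made for illustration, not names taken from this PR.

```python
from itertools import product


def missing_isolines(source_ids, ranges, result_rows):
    """Return the (source_id, range) pairs that produced no result row.

    `result_rows` is an iterable of dicts with the hypothetical keys
    'source_id' and 'data_range'; the real column names may differ.
    """
    expected = set(product(source_ids, ranges))
    received = {(row['source_id'], row['data_range']) for row in result_rows}
    return sorted(expected - received)


# Two input points and two ranges, but only three rows came back.
rows = [
    {'source_id': 1, 'data_range': 100},
    {'source_id': 1, 'data_range': 1000},
    {'source_id': 2, 'data_range': 100},
]
print(missing_isolines([1, 2], [100, 1000], rows))
```

This tells us which isolines are missing, though still not why they failed.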

TODO:

  • Add some results metadata (?). For result tables this will require an additional query.

@elenatorro
Contributor

Hi! 👋 I've a question regarding the API naming:

We're importing the 'Isolines' class, and then we're creating from this instance an Isochrone. That's because the Isoline can be an Isochrone or an Isodistance instance.

Would it be possible to have a common Isoline class, and then two classes (Isochrone and Isodistance) that inherit from it? Then we could do something like:

from cartoframes.services.isolines import Isochrone, Isodistance

isochrone = Isochrone(dataset, [100, 1000], mode='car')
isodistance = Isodistance(dataset, [100, 1000], mode='car')

This is just a suggestion; I don't know what the correct way is. I'm just thinking about following the same convention we use in the rest of the classes. What do you think?
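To make the suggestion concrete, the hierarchy could look roughly like the sketch below. Everything in it is hypothetical (the base class, the `units` attribute, the `describe` method); it is not the API implemented in this PR and it does not call any service.

```python
class Isoline(object):
    """Hypothetical common base class for both kinds of isoline."""

    # Unit that `ranges` is expressed in; each subclass sets its own.
    units = None

    def __init__(self, dataset, ranges, mode='car'):
        if not ranges:
            raise ValueError('at least one range value is required')
        self.dataset = dataset
        self.ranges = sorted(ranges)
        self.mode = mode

    def describe(self):
        return '{} for ranges {} ({}), mode={}'.format(
            type(self).__name__, self.ranges, self.units, self.mode)


class Isochrone(Isoline):
    units = 'seconds'


class Isodistance(Isoline):
    units = 'meters'


isochrone = Isochrone('my_points', [1000, 100], mode='car')
print(isochrone.describe())
```

The subclasses differ only in their units here, so most of the logic could stay in the base class.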

@makella
Contributor

makella commented Sep 16, 2019

hi! I have some questions too!

  • The term 'isolines'
    • Could we consider naming the class "Isopleth" instead, since "pleth" is the part that refers to areas (similar to choropleth for polygons)? When I think of isolines, I think of something like contour lines connecting equal elevations rather than drive-time or walk-time polygons. Just a suggestion for us to think about.
  • From your demo, I couldn't tell if there is a parameter for dissolved or intact boundaries between areas.
    • For example, if you input 20 minute walk time
      • would you get four polygons in 5 min increments, or 1 20 minute polygon?
      • is this user defined?
      • I'm guessing this is a parameter to define and may have missed that part!
  • In the demo, you were generating two time-based polygons for each point: 10 minutes and 1 hour
    • how can we ensure that the 10 minute polygon will draw on top of the 1 hour polygon given that we can't set order for polygons in VL?
    • similarly, in the 20 minute case above, how can we ensure that the incremental polygons draw in the proper order?
    • My main concern here is that if we don't provide a way to do it, or don't already have one by default, users will get stuck when they try to visualize the times/distances in something like a choropleth map. After all, we don't have an order parameter for polygons, so I'm mainly wondering whether we could do it at the data level and whether that would ensure the correct draw order.
  • Similarly, when the result is visualized on the map, we can't really "see" the different time levels because of ordering and symbology.
    • this will likely require more discussion, but when we output the result, a really nice feature would be to have the result styled. I know this is out of scope, but just want to point it out. The suggestion above is also a solution here.
  • I know the angular geometries are because of the provider, but what is the likelihood of us being able to give better looking polygons with another provider like TomTom? is this something the user can set to overwrite the default provider?

@jgoizueta
Contributor Author

I'm glad you brought up the topic of naming, because I'm not happy with the current Isolines either (I used it because we use it in the docs). I used Iso at some point but that seemed too ambiguous.
I'll be happy to go either with Isopleth or having two classes as @elenatorro suggests. Actually I think we could do both and name the base class Isopleth.

@jgoizueta
Contributor Author

From your demo, I couldn't tell if there is a parameter for dissolved or intact boundaries between areas.

@mamata: the API closely follows the SQL (data services) interface. So you specify an array of range values (times for isochrones, distances for isodistances) and, for each value and input point, get one boundary that encloses all the area within (less than) that time/distance (so we don't generate rings, but simple polygons).

For example here we have two input points (A, B, green) and two range values, 5 and 15 and we get four resulting isochrone polygons without holes:

IMG_20190917_104451747

We have a somewhat higher-level interface in camshaft (Builder analyses for trade areas, buffers). The difference is that the number of ranges and the max range are passed instead of individual range values (so the ranges are max/n, ..., max), and that there's a dissolved option where all the isochrones for each range are merged into one (instead of having separate isochrones for each center point). In the previous example we would have two multipolygons, one with boundaries A-5, B-5 and the other with A-15, B-15.
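For illustration, the two differences can be sketched in plain Python. Both function names are made up for this example, and the "dissolve" step only groups boundary labels by range; it stands in for the geometric merge, which is not shown.

```python
from collections import defaultdict


def evenly_spaced_ranges(max_range, n):
    """Return the n range values max/n, 2*max/n, ..., max."""
    if n < 1:
        raise ValueError('n must be at least 1')
    return [max_range * i / n for i in range(1, n + 1)]


def dissolve_by_range(isolines):
    """Group (center, range) pairs by range value.

    Each group lists the boundaries that a dissolved result would
    merge into a single multipolygon.
    """
    groups = defaultdict(list)
    for center, value in isolines:
        groups[value].append('{}-{}'.format(center, value))
    return dict(groups)


# A 20 minute (1200 s) walk split into four ranges.
print(evenly_spaced_ranges(1200, 4))

# The two-point, two-range example from above.
print(dissolve_by_range([('A', 5), ('B', 5), ('A', 15), ('B', 15)]))
```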

My main concern here is that if we don't provide a way to do it, or don't already have one by default, users will get stuck when they try to visualize the times/distances in something like a choropleth map. After all, we don't have an order parameter for polygons, so I'm mainly wondering whether we could do it at the data level and whether that would ensure the correct draw order.

Now, when having multiple ranges we have the problem of larger range value polygons hiding the smaller values, since the larger value ones always completely cover all the smaller ones. As you have noticed this is going to be a real problem for the users. What can we do about it?

  • Ordering the results by increasing range value. Camshaft actually does the opposite. The problem is how to force this order in the visualization. I thought there was an ordering expression in CARTO-VL (which could still produce artifacts at tile borders), but I might be wrong.
  • Using transparency.
  • Computing rings for areas between ranges (i.e. subtract smaller ranges from larger ones).
  • Generating separate dataset results for each range, so they have to be added as separate layers of a map and the ordering can be controlled.

The problem with the first two is that I don't see how we could automate that styling, and I think that just advising users on how to style isochrone results wouldn't be enough.

I like the idea of generating rings, but that can be computationally expensive, and not always what the user needs. We could add options for that, but we'd need to make rings the default to avoid the visualization problem for unsuspecting users.

And I also would like to generate separate datasets for each range value, but I'm afraid the resulting API would be too complex, what do you think?

I know the angular geometries are because of the provider, but what is the likelihood of us being able to give better looking polygons with another provider like TomTom? is this something the user can set to overwrite the default provider?

I don't see how we could handle this on the API side, since in the end the provider, and consequently the quality, will depend on how much 💰 the user pays.

@jgoizueta
Contributor Author

🤔 To solve the overlapping ranges problem, maybe we could have a helper method to add the isochrones result to a map, which could add a layer for each range (filtering the result dataset by range value).
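The filtering part of such a helper might look like the sketch below. It only splits result rows per range and orders the groups; the function name and the `data_range` key are assumptions, and it relies on the further assumption that layers added later to a map are drawn on top of earlier ones.

```python
from collections import defaultdict


def layers_by_range(rows, range_key='data_range'):
    """Split result rows into one group per range value, largest first.

    Adding the groups to a map in the returned order would leave the
    smallest range on top, assuming later layers draw over earlier ones.
    `range_key` is a hypothetical column name.
    """
    grouped = defaultdict(list)
    for row in rows:
        grouped[row[range_key]].append(row)
    return [(value, grouped[value]) for value in sorted(grouped, reverse=True)]


result = [
    {'source_id': 'A', 'data_range': 600},
    {'source_id': 'A', 'data_range': 3600},
    {'source_id': 'B', 'data_range': 600},
    {'source_id': 'B', 'data_range': 3600},
]
for value, layer_rows in layers_by_range(result):
    print(value, [row['source_id'] for row in layer_rows])
```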

@makella
Contributor

makella commented Sep 17, 2019

@jgoizueta

Regarding the naming, yes, I like @elenatorro's suggestion as well and whatever you feel is the best class name. Obviously aligning with current doc makes sense but even in the doc, we say Isoline but then talk about generating polygons... I think that is the most confusing part for me! For example, the first part in the doc says lines, then polygon, then area..

Isolines are contoured lines that display equally calculated levels over a given surface area. This enables you to view polygon dimensions by forward or reverse measurements. Isoline functions are calculated as the intersection of areas from the origin point, measured by distance (isodistance) or time (isochrone).

the API closely follows the SQL (data services) interface. So you specify an array of range values (times for isochrones, distances for isodistances) and, for each value and input point, get one boundary that encloses all the area within (less than) that time/distance (so we don't generate rings, but simple polygons).

Ahhhh, ok, I was, in my head, thinking that this was more similar to the way that we do it in Builder. Before I write more, let me try out this implementation to "see" what the result is to understand what output we'll get vs. the Builder one that I am familiar with and have in my mind. Then I'll be able to comment better on the other pieces.

thanks!!

@makella
Contributor

makella commented Sep 19, 2019

@jgoizueta as discussed today, we can share the rationale for proposed naming with the group on Monday and also I will work on some viz ideas for the default output and we can discuss the visualization challenge with the group as well

@jgoizueta jgoizueta changed the base branch from feature/888-geocode to develop September 20, 2019 18:31
@jgoizueta
Contributor Author

jgoizueta commented Oct 1, 2019

While completing the docs, I've noticed a couple of things:

  • I intended to replace the custom Result classes by namedtuple but forgot about it: I'll make this change.
  • cartodb_id on output should be removed if not present in input as done for geocoding
  • the with_source_id parameter could be removed, and source_id be produced only if the input has a cartodb_id column.
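As a rough sketch of what these three changes amount to: the result type, its field names, and the `data_range` / `the_geom` column names below are all assumptions made for illustration, not the code in this PR.

```python
from collections import namedtuple

# Hypothetical result type replacing a custom Result class.
IsolinesResult = namedtuple('IsolinesResult', ['data', 'metadata'])


def output_columns(input_columns):
    """Choose the output columns following the rules listed above.

    cartodb_id and source_id are only produced when the input has a
    cartodb_id column; the other column names are illustrative.
    """
    columns = ['data_range', 'the_geom']
    if 'cartodb_id' in input_columns:
        # Only then can each isoline be linked back to its source row.
        columns = ['cartodb_id', 'source_id'] + columns
    return columns


result = IsolinesResult(data=output_columns(['name', 'the_geom']),
                        metadata={'required_quota': 4})
data, metadata = result  # a namedtuple unpacks like a plain tuple
print(data, metadata)
```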

Also, cartodb_id is removed from output if it wasn't in the source
# Conflicts:
#	cartoframes/data/dataset/registry/dataframe_dataset.py
Contributor

@simon-contreras-deel simon-contreras-deel left a comment

I have added some comments about credentials and Dataset, and about a possible error. I am not sure whether we should avoid lines longer than 120 characters even in comments.

In any case, it looks awesome.

super(Isolines, self).__init__(credentials, quota_service=QUOTA_SERVICE)

def isochrones(self, source, range, **args):
"""isochrone areas
Contributor

Cool doc 💪

Credentials have already been set in the Dataset constructor
Dataframe datasets ignore the constructor credentials,
and, in any case, for dataframes we're setting the Geocoder
credentials to upload them.
Generate proper legend
Contributor

@simon-contreras-deel simon-contreras-deel left a comment

🚀 💪

@jgoizueta jgoizueta merged commit d43b466 into develop Oct 2, 2019
@jgoizueta jgoizueta deleted the feature/889-isolines branch October 2, 2019 15:15
Successfully merging this pull request may close these issues.

Easy way to create isolines
6 participants