DO UI analysis #11885

Merged
nobuti merged 167 commits into data-observatory-analysis-ui from 11802-do-requests on Jun 20, 2017

Conversation

@nobuti (Contributor) commented Mar 29, 2017

This PR implements #11802, #11621 and #11619.

[Screenshot: do-ui]

Things to do:

  • More tests
  • Column name for every measurement
  • Get rid of the move cursor for the measurement item
  • Test behaviour thoroughly

How to test

  • Check that we can add/remove multiple measurements.
  • Check that every measurement works independently.
  • Check that the new columns are generated properly after an analysis is executed. The new column name is a combination of the selected measurement, denominator and timespan values (see the sketch after this list).
  • Check that quota limits work properly for multiple measurements. The rule is that each analysis consumes n measurements * y rows.
  • Check that we can edit the analysis once saved.
  • Check the form's validation. Only the measurement field is required.
  • Check that the save button works properly:
    • A user can't save an invalid form.
    • Once saved, the button is disabled until the form changes.
  • This PR introduces a new component for this analysis. The first dropdown allows filtering on the frontend side, showing up to 100 items to select. We should check that performance is good.
  • Check that tracking CSS classes are generated properly.
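
As a quick illustration of the column naming and quota rules above, here is a minimal sketch. This is not the actual Builder code; the names and the exact sanitisation rules are assumptions.

```typescript
// Illustrative sketch only: how a generated column name could be derived from
// the selected measurement, denominator and timespan, and how the quota cost
// of an analysis could be estimated as measurements * rows.
interface MeasurementSelection {
  measurement: string;
  denominator?: string;
  timespan?: string;
}

function generatedColumnName(selection: MeasurementSelection): string {
  // e.g. { measurement: 'Hispanic Population', denominator: 'Population', timespan: '2011 - 2015' }
  //      -> 'hispanic_population_population_2011_2015'
  return [selection.measurement, selection.denominator, selection.timespan]
    .filter((part): part is string => Boolean(part))
    .join('_')
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '_')
    .replace(/^_+|_+$/g, '');
}

function estimatedQuotaCost(selections: MeasurementSelection[], rowCount: number): number {
  // "Each analysis will consume n measurements * y rows."
  return selections.length * rowCount;
}
```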

@ethervoid (Contributor)

I'm working on the NULL issue right now

@@ -4,6 +4,7 @@
@import '../cartoassets/entry';
@import '../colorpicker/bootstrap-colorpicker/bootstrap-colorpicker';

@import 'vars';
Review comment on the diff above (Contributor):

👍

@saleiva (Contributor) commented Jun 15, 2017

The thing is that then we will lose access to the analysis that works in favour of an analysis that doesn't work. To me it sounds too early, but it's up to @kevin-reilly to decide.

@xavijam (Contributor) commented Jun 15, 2017

Agreed @saleiva, we should wait until we have the NULLs issue fixed 👍. Go go go go!

@ethervoid (Contributor)

I've created a PR that fixes the reported NULL problem. Returning NULL can be driven by multiple causes; in this case we were creating the "boundaries" of the dataset and including the NULL values, which seems to cause a misbehaviour that leads the DO to think it's dealing with polygons.

In any case, using the DO with polygon datasets may not be possible: we need to know whether we're able to interpolate the measure with the polygons. If you want to know more about this, please ask me.
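
For illustration only, the kind of guard described above might look like the following sketch. This is not the actual DO fix, and the row shape is an assumption.

```typescript
// Conceptual sketch (not the real DO code): exclude NULL values before
// deriving the dataset "boundaries" that are handed to the Data Observatory,
// so NULL rows cannot skew what the DO thinks it is dealing with.
interface Row {
  lng: number | null;
  lat: number | null;
  value: number | null;
}

type Bounds = { minLng: number; minLat: number; maxLng: number; maxLat: number };

function datasetBounds(rows: Row[]): Bounds | null {
  const valid = rows.filter((r) => r.lng !== null && r.lat !== null && r.value !== null);
  if (valid.length === 0) return null;

  const bounds: Bounds = { minLng: Infinity, minLat: Infinity, maxLng: -Infinity, maxLat: -Infinity };
  for (const r of valid) {
    bounds.minLng = Math.min(bounds.minLng, r.lng as number);
    bounds.minLat = Math.min(bounds.minLat, r.lat as number);
    bounds.maxLng = Math.max(bounds.maxLng, r.lng as number);
    bounds.maxLat = Math.max(bounds.maxLat, r.lat as number);
  }
  return bounds;
}
```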

@saleiva (Contributor) commented Jun 15, 2017

About the polygons part... @ethervoid, can you elaborate? Not sure I follow.

@ethervoid (Contributor)

Sure. The DO can interpolate the measure value using polygons, but we have some special cases where we can't (you can't see it here).

So in some cases (I don't fully understand how the DO data works yet) we're not able to return a value, so we return NULL. Talking with @xavijam, we both agree on giving the user some kind of feedback in these cases, or filtering those measures out directly (we would have to analyze the dataset beforehand). I've added a note to the roadmap to take it into consideration.

Hope I've explained it well 🤞

@kevin-reilly

Did the staging version revert to the production version somehow? Am I missing something?

I'm seeing this:

[Screenshot: kr test carto 2017-06-15 17-52-06]

@xavijam (Contributor) commented Jun 16, 2017

Sorry @kevin-reilly, something happened with the new feature flag and it was not enabled. It is enabled again. Apologies again :(.

@xavijam (Contributor) commented Jun 16, 2017

There is something I've run into several times (also reported by @kevin-reilly), but I'm not sure how to manage it. Steps:

  • Create a new DO analysis with a measurement A and timespan B.
  • The analysis will generate a column called A_B (for example).
  • After finishing it successfully, the user styles the layer by that column (obtaining a choropleth map 💃).
  • It turns out he/she wants to change that DO analysis and selects another timespan, in this case C.
  • So the analysis will generate another column, in this case called A_C.
  • When the user applies the new analysis, the Map instantiation will fail because the style will be referencing a non-existent column, A_B.

What should we do in this case, @saleiva @kevin-reilly? I'm not sure whether we should reset the styles of the applied analysis.
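
One possible way to handle the steps above, sketched under assumptions (these are not Builder's real modules or names): when the analysis output changes, check whether the styled column still exists and fall back to the default style if it doesn't.

```typescript
// Minimal sketch of one option: reset to the default style when the column
// driving the style (e.g. A_B) disappears from the analysis output (A_C).
interface LayerStyle {
  column: string | null; // column driving the choropleth, if any
}

function reconcileStyle(
  style: LayerStyle,
  outputColumns: string[],
  defaultStyle: LayerStyle
): LayerStyle {
  if (style.column && !outputColumns.includes(style.column)) {
    // Styled column is gone after the analysis changed: reset to the default
    // style instead of breaking map instantiation.
    return { ...defaultStyle };
  }
  return style;
}
```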

@ethervoid (Contributor)

Fix for the NULL issue reported by @kevin-reilly is deployed.

@kevin-reilly

Regarding @xavijam's comment about formatting: ideally, if A_B is used for formatting, that formatting should be applied to A_C when the analysis is changed.

If that is not possible, and A_B no longer exists because of the change to A_C, then the formatting should return to the default.

How is this handled outside of Analysis? For example, if I have a column called Sales that I use for formatting, what happens if I drop that column using SQL?

@kevin-reilly

A couple more things I'm seeing after testing today. All tests use this map and the lightrailstations dataset unless otherwise noted.

Normalizing

Using the Hispanic Population 2015 dataset:

If I do not select Normalize, I cannot select a boundary; the selector defaults to "No values." The analysis completes with different values for most points but with some null values.

If I do select Normalize > Population, the selector allows me to choose a boundary, but the Analysis returns a null for each point. I tried this with both US Census Blocks and Shoreline clipped Elementary School Districts.

If I do select Normalize > Per sq km, I cannot select a boundary; the selector defaults to "No values" and the Analysis returns a null for each point.

Boundaries

  • These should be called "Geographic Units".
  • If there is both a Shoreline clipped and a non-Shoreline clipped geography, we should only present the Shoreline clipped option and remove the "Shoreline clipped" prefix from the description. This is confusing.
  • How are these sorted? It should be largest to smallest, but I'm not sure it works that way.

Timespans

The Timespan options are a bit strange for many datasets. Looking at Hispanic Population again, I see:

  • 2015-2015
  • 2014-2014
  • 2011-2015
  • 2010-2013
  • 2010-2010
  • 2006-2010

The single-year timespans should just show "2015", not "2015-2015", for example.
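
For what it's worth, the display tweak could be as small as the following sketch (illustrative only; the function name is hypothetical):

```typescript
// Collapse a "YYYY-YYYY" timespan label to a single year when the start and
// end years are the same, e.g. "2015-2015" -> "2015".
function formatTimespan(label: string): string {
  const match = label.match(/^(\d{4})\s*-\s*(\d{4})$/);
  if (match && match[1] === match[2]) {
    return match[1];
  }
  return label;
}

// formatTimespan('2015-2015') === '2015'
// formatTimespan('2011-2015') === '2011-2015'
```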

Region

Using the lightrailstations dataset, which is only in the US, the Region selector presents only that as an option when I click the pulldown. This is correct, but if there is only one option, it should be populated by default rather than making me select it.

Using the Irish Counties dataset (which is only in Ireland...), when I select the Region menu, it only offers me "United Kingdom." I believe this should offer EU datasets as well.

I created a dataset called Dummy which only includes two points, one in Texas and one in Ireland. When I try to create an Analysis for this data set, the Region menu offers me United States, Canada, Spain, and European Union but not United Kingdom. This seems wrong.

@jorgesancha (Contributor)

Hey @kevin-reilly

I'm not seeing the same results that you are seeing regarding normalization. I was able to select boundaries with that dataset and I always got results back, except in the first case, where there were sometimes null values.

Not normalized
[screenshot]

Normalized
[screenshot]

Normalized > Per Square Km
[screenshot]

Regarding everything else, while I agree with pretty much all of your feedback, I don't think we should use this ticket to solve what are fundamentally bad decisions on the design of the DO architecture or inconsistencies in the data. Bear in mind that what we show in the UI ("Shoreline clipped", etc... or the actual Timespans) is what we get from the DO API, and if we want to change something, we should change it there. Making adjustments on the client to cater for these things is not the right way to go.

Regarding "Boundaries" vs "Geographic Units", I would think twice before changing it; or if we do, we should probably change it everywhere else (in the DO, I mean). So far, everyone has referred to this as boundaries, so is quite a significant change.

If there are no more client-side issues to fix, I would suggest we put this in production under a feature flag and open it to the team, and use this interface to:

  • Better explore the DO data and potential inconsistencies
  • Come up with
    • a) a list of issues we want to prioritise on the DO (whether changes to the DO or the actual Data) before we announce it
    • b) any potential changes to the UI to make things clearer or better based on the feedback we get

What do you think? Should we get together to discuss briefly this afternoon?

cc/ @juanignaciosl @saleiva

@xavijam (Contributor) commented Jun 19, 2017

> If that is not possible, and A_B no longer exists because of the change to A_C, then the formatting should return to the default.

Right now we don't have any logic for when an analysis has changed and a "conditional" style was set. But we do have it for when an analysis is removed, where we try to keep the previous style.

> How is this handled outside of Analysis? For example, if I have a column called Sales that I use for formatting, what happens if I drop that column using SQL?

If the user styles the layer by a column called "example" (sketched below):

  • and drops that column, the map instantiation will fail (an error will appear), and they will need to change the style in order to visualize it again.
  • and changes the SQL to select only cartodb_id, the_geom and the_geom_webmercator, it will "reset" the style and apply the default one.
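
A minimal sketch of the behaviour described above, with hypothetical names (not Builder's actual code):

```typescript
// Decide what happens to the layer style after a SQL change, per the two
// cases described above.
const BASE_COLUMNS = ['cartodb_id', 'the_geom', 'the_geom_webmercator'];

type Outcome = 'keep' | 'reset-to-default' | 'instantiation-error';

function styleOutcome(styledColumn: string | null, queryColumns: string[]): Outcome {
  const onlyBaseColumns = queryColumns.every((c) => BASE_COLUMNS.includes(c));
  if (onlyBaseColumns) {
    // Query reduced to cartodb_id/the_geom/the_geom_webmercator:
    // the style is reset and the default one is applied.
    return 'reset-to-default';
  }
  if (styledColumn && !queryColumns.includes(styledColumn)) {
    // Styled column was dropped: map instantiation fails and an error appears
    // until the user changes the style.
    return 'instantiation-error';
  }
  return 'keep';
}
```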

@saleiva (Contributor) commented Jun 19, 2017

Hi guys, I might be arriving late to the party 🎉, but remember we should ideally show the default timespan as selected in the input. Otherwise, it looks like an extra required step.

@juanignaciosl (Contributor)

I agree with jsancha: +1000 to feature-flagging this and opening issues for fixes and improvements.

@ethervoid (Contributor)

@saleiva right now, AFAIK, the DO doesn't have a way to define a timespan as the default for a numerator measurement. We could add a new column for it in the metadata part of the database that marks the latest timespan as the default, but the closest we can get today is to order them and pick the first one.

The problem with this approach is that we only have strings for the timespans, so it's not 100% reliable. I've added a new issue that would make it possible for us to filter/sort the timespans.
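
A rough sketch of that ordering heuristic (illustrative only; as noted, string timespans make it unreliable):

```typescript
// Pick a default timespan by ordering the strings most-recent first.
// Timespans are plain strings such as "2011 - 2015" or "2015 - 2015".
function defaultTimespan(timespans: string[]): string | undefined {
  const key = (t: string): number => {
    const years = (t.match(/\d{4}/g) || []).map(Number);
    // Rank by end year, then by start year, so "2015 - 2015" beats "2011 - 2015".
    return years.length ? years[years.length - 1] * 10000 + years[0] : -1;
  };
  // Sort most recent first and take the first entry as the default.
  return [...timespans].sort((a, b) => key(b) - key(a))[0];
}

// defaultTimespan(['2006 - 2010', '2011 - 2015', '2015 - 2015']) === '2015 - 2015'
```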

@kevin-reilly

I'm ok with opening this up to more internal users via feature flag. I know there are people looking forward to testing this so that will help expose any other issues.

I am a bit concerned that @jorgesancha was not able to reproduce my normalization issue. I checked this exact thing a number of times on Friday to make sure I wasn't doing something wrong. What would cause us to see different results?

I agree absolutely that Builder should present all datasets made available by the API. The API is the correct place to address my comment about the 'shoreline' clipped datasets. We should put that in the backlog.

@ethervoid (Contributor) commented Jun 19, 2017

@kevin-reilly we're currently working on the clipped issue here

Regarding the normalization issue, it's weird and concerning, so we should take a look at it and open an issue if we can come up with steps to reproduce.

@nobuti (Contributor, Author) commented Jun 20, 2017

Guys, this PR has grown too much and has become unmanageable, so we are creating individual issues for every fix we need to make in order to improve the DO analysis. So, I'm merging this into the data-observatory-analysis-ui branch and opening PRs from there.

@nobuti nobuti merged commit a7d88f9 into data-observatory-analysis-ui Jun 20, 2017
@nobuti nobuti deleted the 11802-do-requests branch June 20, 2017 09:41