1989 guessing of namedplaces #2809
Conversation
Frontend tests were OK 👍 (details)
Why "this does not seem to scale well"? What do you mean?
```ruby
def count_namedplaces(sample, column_name_sym)
  sql_array = sample.map{|row| "'" + row[column_name_sym] + "'"}.join(',')
  query = "WITH geo_function as (SELECT (geocode_namedplace(Array[#{sql_array}])).*) select count(success) FROM geo_function where success = TRUE"
```
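As a side note, here is a minimal runnable sketch of the query builder above, with one addition that is not in the PR: single quotes in the sampled values are doubled, so a value like L'Aquila doesn't break the generated statement (the helper name is hypothetical):

```ruby
# Sketch: build the counting query for one sampled column.
# `sample` is an array of row hashes, `column_name_sym` the column key.
# Doubling single quotes is an addition here; the PR code interpolates
# the raw values directly.
def build_namedplace_count_query(sample, column_name_sym)
  sql_array = sample.map { |row|
    "'" + row[column_name_sym].to_s.gsub("'", "''") + "'"
  }.join(',')
  "WITH geo_function AS (SELECT (geocode_namedplace(Array[#{sql_array}])).*) " \
  "SELECT count(success) FROM geo_function WHERE success = TRUE"
end

puts build_namedplace_count_query([{ city: "London" }, { city: "L'Aquila" }], :city)
```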
Replying to @javisantana: this is the critical part. This is going to be called through the SQL API once per column, with an array of up to sample size.
Due to the nature of the sampling and the dataset itself, I don't expect a good cache hit rate for these queries, nor for the geocodings on the full datasets.
If you're OK with it, then so am I, and I'd be more than happy to release it :)
OK, so why don't we add a metric there? Also, if you feel the call is going to be really expensive, we have the option of a different geocode_namedplace
that takes not just a single array but an array per column. We're doing this only for text columns, right?
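The batched variant suggested here could look like this sketch, assuming a hypothetical geocode_namedplace overload that accepts one array per column, so all text columns go through a single query and connection:

```ruby
# Sketch (assumed API): build one query covering several text columns,
# passing a separate Array literal per column to a hypothetical
# geocode_namedplace variant that accepts multiple arrays.
def build_multi_column_query(sample, column_names)
  arrays = column_names.map do |col|
    values = sample.map { |row| "'" + row[col].to_s.gsub("'", "''") + "'" }.join(',')
    "Array[#{values}]"
  end
  "WITH geo_function AS (SELECT (geocode_namedplace(#{arrays.join(', ')})).*) " \
  "SELECT count(success) FROM geo_function WHERE success = TRUE"
end

puts build_multi_column_query([{ city: "Berlin", region: "Texas" }], [:city, :region])
```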
+1 to adding a metric and checking whether performance is actually a problem.
Yes, we're only querying text columns.
About sending one single query with all the sample's text columns: do you think it would be more efficient in the general case?
It would be more efficient because you only need to open one connection (at least that).
Let's add that metric and see how it looks.
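One way the metric could look (a sketch only, not the PR's implementation; the metric name and `puts` sink are placeholders for whatever instrumentation CartoDB actually uses): time the guessing query and log the elapsed milliseconds so real-world cost can be evaluated before optimizing.

```ruby
# Sketch: time the guessing query so its real-world cost can be measured.
# The log line format is a placeholder, not CartoDB's actual metrics API.
def with_guessing_metric(column_name)
  started = Time.now
  result = yield
  elapsed_ms = ((Time.now - started) * 1000).round
  puts "namedplace_guessing column=#{column_name} time_ms=#{elapsed_ms}"
  result
end

# Usage: wrap the SQL API call (stubbed here with a constant).
count = with_guessing_metric(:city) { 42 }
```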
```diff
@@ -199,12 +233,13 @@ def geocode(formatter, geometry_type, kind)
       geometry_type: geometry_type,
       kind: kind,
       max_rows: nil,
-      country_column: nil
+      country_column: country_column_name,
+      countries: "'#{country}'"
```
Note to self: this should be nil if `country` is nil.
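That fix could be as small as this sketch (the helper name is hypothetical): only build the quoted countries literal when a country was actually guessed, and pass nil through otherwise.

```ruby
# Sketch of the self-note: quote the country only when one is present;
# otherwise keep the parameter nil so no bogus "''" literal reaches SQL.
def countries_param(country)
  country.nil? ? nil : "'#{country}'"
end
```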
👍 Nice tests!
This adds the namedplaces guessing, taking countries into account.
@Kartones this is pretty much ready for a "final" review, can you please take another look?