![Wherobots logo](../assets/img/header-logo.png)

# Working with Foursquare Places data in Wherobots

This notebook introduces the [Foursquare Places table](https://cloud.wherobots.com/spatial-catalog?redirect=/spatial-catalog&catalogId=2rk2zjbg7pl6f8lb7xkzv&database=foursquare&table=places) in the Wherobots Open Data catalog. We will create a live, interactive map like that shows the count of coffee shops in San Francisco aggregated by neighborhoods.

<img src="./assets/img/choropleth.png" alt="drawing" width="400"/>

In [None]:
from sedona.spark import *

config = SedonaContext.builder().getOrCreate()
sedona = SedonaContext.create(config)

# Connecting

First, let's list the tables in the `wherobots_open_data.foursquare` namespace.

In [None]:
sedona.sql("SHOW tables IN wherobots_open_data.foursquare").show(truncate=False)

We can use `printSchema` to see the columns in both these tables, and you can find the full documentation of [the schema on the Foursquare site](https://docs.foursquare.com/data-products/docs/places-os-data-schema).

In [None]:
sedona.table("wherobots_open_data.foursquare.places").printSchema()

In [None]:
sedona.table("wherobots_open_data.foursquare.categories").printSchema()

# Counting records

Let's see how many records exist total and with non-null entries for some core columns.

In [None]:
places_df = sedona.table("wherobots_open_data.foursquare.places")
places_df.createOrReplaceTempView("places") 

In [None]:
sedona.sql("""
SELECT 
    COUNT(*) as total_places,
    COUNT(CASE WHEN name IS NOT NULL THEN 1 END) as has_name,
    COUNT(CASE WHEN address IS NOT NULL THEN 1 END) as has_address,
    COUNT(CASE WHEN fsq_category_labels IS NOT NULL THEN 1 END) as has_categories,
    COUNT(CASE WHEN date_closed IS NULL THEN 1 END) as still_open
FROM places
""").show()

There are more than 100 million places in this dataset. About two-thirds of them have addresses associated with them, and most of them (over 90 million) are assigned to a category.

# Selecting specific columns and places that are not closed

We can select the colummns we could use to find coffee shops that are currently open: Foursquare place ID, name, geometry, category labels, country, date refreshed, and date created. This query excludes places that are closed or have a null value for name, geom, or country.

In [None]:
places_df = sedona.sql("""
    SELECT 
        fsq_place_id,
        name,
        geom,
        fsq_category_labels,
        country,
        date_refreshed,
        date_created
    FROM places
    WHERE 
        date_closed IS NULL 
        AND name IS NOT NULL
        AND geom IS NOT NULL
        AND country IS NOT NULL
""")

places_df.createOrReplaceTempView("places")

In [None]:
places_df.show(10)

# Geographical subsets

We can use regular SQL to limit our dataset to the United States.

In [None]:
us_df = sedona.sql("""
SELECT *
FROM places
WHERE country = 'US'
""")

us_df.createOrReplaceTempView("us_places")

In [None]:
us_df.show(10)

We can also filter to a region or polygon that is not part of the dataset by defining an area of interest and doing a spatial join. We'll start with a WKT (well-known text) representation of the boundaries of San Francisco.

In [None]:
SF = "MULTIPOLYGON (((-122.4773830027518 37.8110279985324, -122.47638299914844 37.810827999509335, -122.4759830013289 37.809727998200174, -122.4716250028432 37.80888699769438, -122.47028300451245 37.808627998875025, -122.46779299919203 37.80680899858057, -122.46768300135908 37.8067279974676, -122.46575100100827 37.80568799892819, -122.46378300203135 37.80462799989076, -122.4542820032217 37.80632799891723, -122.45348200072374 37.80652799950531, -122.44828200078767 37.80732800140436, -122.4470820015151 37.807428001332454, -122.4460820014932 37.80762799774489, -122.44528200371676 37.80762799953789, -122.44378200243108 37.80782799805509, -122.44378200251181 37.80766799854344, -122.4437820034566 37.807527999295765, -122.44738200200705 37.80632799796006, -122.44708199997392 37.80542799865116, -122.4425820018721 37.806027999575576, -122.44268200256738 37.806527999340844, -122.44388200245132 37.80642800005626, -122.44269200115872 37.80676800008563, -122.44248200427785 37.80682799876855, -122.44288200531544 37.808227999998806, -122.44018200276149 37.808627999916574, -122.43988200218001 37.8085279999752, -122.44018200208507 37.808127998825874, -122.44238200240515 37.80782799780524, -122.44218200297193 37.80682800062341, -122.43578200185702 37.80762799795391, -122.43528200153574 37.80682799978457, -122.4342820009474 37.806827997751036, -122.43348200087478 37.80562800130018, -122.4325820027682 37.80562799922645, -122.43218200240449 37.8061279996238, -122.43238200360143 37.80862800149156, -122.43148200301285 37.809227999915386, -122.43048200217724 37.80922800085042, -122.42998200010045 37.80922800091288, -122.42988200246609 37.808627999082745, -122.42808200017859 37.80842800136175, -122.42708200218428 37.80822799900991, -122.42765300181587 37.80875500071409, -122.42800299984292 37.808787997316905, -122.42799300130218 37.80891399918604, -122.42755800062385 37.808854999693345, -122.4267829995374 37.80841900067282, -122.42669600145317 37.808334998727894, -122.4266820019633 37.80832799757732, -122.42678200221529 37.809627997738794, -122.42666100185696 37.80998099838418, -122.42631000048999 37.81036799899729, -122.42576700105225 37.81063099870859, -122.42501400327087 37.810824998720925, -122.42424300368275 37.81092199782699, -122.4241560029319 37.81089399790159, -122.4240650026507 37.81077299816562, -122.42406299957213 37.810635998954744, -122.4242010010643 37.810541001228984, -122.42437500542296 37.810556000650806, -122.42453200390129 37.81062700094562, -122.42484100273482 37.810619999922885, -122.42533900174496 37.81051999861559, -122.42580200170394 37.810310001112434, -122.42612700077913 37.810124000007164, -122.42638300181089 37.80990999955315, -122.42650200234326 37.809652000542215, -122.42658199976377 37.8093659993027, -122.42650700033482 37.80891399914753, -122.42632699933567 37.80826000088302, -122.4260830001641 37.807716000685296, -122.42578200256017 37.80732800166161, -122.42536100151737 37.80706999970865, -122.42489500209683 37.806860999315965, -122.42440600144118 37.806760001588614, -122.42404200337303 37.80686499928091, -122.42288200057364 37.807527997822895, -122.42248200200063 37.807827997719485, -122.42098200153883 37.80882799803966, -122.42244900117417 37.81024200095932, -122.42230300146875 37.81034199997368, -122.42138200373215 37.80962799948522, -122.42046800214149 37.80873699972544, -122.42023400002834 37.80858899970268, -122.4199140042186 37.80855200041425, -122.41960700253637 37.8085940010987, -122.41918200109903 37.808627999393416, -122.41807400298946 37.80869299882694, -122.41812600320026 37.80890099704543, -122.4191989993628 37.809347000945394, -122.4194030047159 37.80958499887034, -122.41948200197801 37.809927998487645, -122.41766300368221 37.809018998122816, -122.41748200392034 37.80892799706236, -122.41738200405558 37.80852800165852, -122.41628200196993 37.80852800002447, -122.41622500291291 37.808729998992746, -122.41618200210603 37.80892799900233, -122.41718200161735 37.80922800078798, -122.41803200021904 37.80979499840296, -122.41958200179675 37.81082799908727, -122.42038200169496 37.81152799938572, -122.42017500382134 37.811658001549766, -122.41968200200591 37.81142799980818, -122.41918200241825 37.811227998905146, -122.41910200194411 37.81142800021111, -122.41933000325004 37.81157800101444, -122.41956700213828 37.81173100014912, -122.41935600150981 37.81184499938158, -122.41898200208425 37.811727999280265, -122.41508200267053 37.809428000783896, -122.4150820031481 37.80962799951929, -122.41478199936304 37.80952799947711, -122.41508200043906 37.80992799903731, -122.4133820024403 37.8094279971125, -122.41498200295717 37.81052800091702, -122.41428200355095 37.810827997092275, -122.41198199991251 37.80922799957522, -122.41368200202565 37.81112799820345, -122.41278200198045 37.81142799981317, -122.41088200309798 37.80902800094631, -122.41008200282964 37.80892799938832, -122.41108200335006 37.81092800029658, -122.41068200306198 37.81102799890845, -122.40948200517096 37.80882800134803, -122.40838200074624 37.808627998947095, -122.40898199980259 37.810628000350135, -122.40848200102701 37.810827998129106, -122.40758200117207 37.80842799916644, -122.40658200064172 37.80822799976153, -122.40668200093207 37.810127998120606, -122.40598200226914 37.81012800065774, -122.40588200096187 37.80732799956383, -122.40518199951302 37.807127999427536, -122.40438200175515 37.80892799733322, -122.40388200307878 37.80872799833474, -122.40478200448331 37.806828000674194, -122.40408200116757 37.80652799907868, -122.4026820006834 37.8082280010181, -122.40228200304813 37.807927999452055, -122.40348200249693 37.806527998952134, -122.40278200300023 37.80612799739464, -122.40138200245676 37.807328000487004, -122.40088200178512 37.80692799907081, -122.40228200083449 37.80512799859911, -122.40128200296809 37.80442799906462, -122.39938200154478 37.80562800117097, -122.39898200038397 37.80532799975319, -122.40108200196973 37.80382800063864, -122.40068200349651 37.803527999393836, -122.3985820026859 37.804728001033844, -122.39818200447682 37.80432799920873, -122.40038199948873 37.80312800015995, -122.39998200352227 37.802727998772774, -122.39778200372632 37.80392799885445, -122.39748200009281 37.80352800073433, -122.39968200312713 37.80232800050047, -122.3993820024333 37.801927998745356, -122.39718199962128 37.80312799907564, -122.3960820021795 37.80192799975921, -122.39818200293983 37.80062799963719, -122.39788200263072 37.80042799800635, -122.39578200201187 37.8015280001348, -122.39538200030101 37.80102799691871, -122.39740900392295 37.79990999741907, -122.39707400063863 37.79952700152941, -122.39688200329475 37.799327998582314, -122.39478199979875 37.800627998114976, -122.3943819995095 37.80022799755025, -122.39650000006661 37.79887799957552, -122.39628200031818 37.798627997565234, -122.39478200010981 37.79952800107447, -122.39448200301514 37.79922799867442, -122.39598200079955 37.798327998930745, -122.39568200370483 37.79802799951658, -122.39388200002843 37.799027999220236, -122.3934820018339 37.798728001638665, -122.39518500184039 37.79763100032291, -122.39508200050959 37.79752799914492, -122.39482300261443 37.797254000036524, -122.39318200252099 37.798227997851576, -122.39288200157213 37.798028000277746, -122.3936820010726 37.79722799932066, -122.39384999902708 37.797061001657724, -122.39279199894557 37.79580199813048, -122.39142900193538 37.79650300092452, -122.39098200130167 37.79602400087213, -122.39194799919889 37.795284000299766, -122.39208100410949 37.79505400075974, -122.3918040035061 37.79472600004716, -122.39158100127526 37.79452799980941, -122.39048900219446 37.79297400026549, -122.39048100152887 37.79292800001881, -122.38995500220811 37.792433998401755, -122.38947700219029 37.79196499942784, -122.38901600344064 37.79150999904444, -122.3886699992284 37.791210999542415, -122.38662600520665 37.79186099955791, -122.38649500134741 37.79165299709037, -122.38847200436915 37.79094999967081, -122.38829100067558 37.79074200054798, -122.38819200323144 37.79044299927424, -122.38815900428267 37.79032599928725, -122.38661000272084 37.79069000169763, -122.38568100163222 37.79092799743648, -122.38636200317578 37.79031100103992, -122.38758900183208 37.78990299869877, -122.38708100381295 37.78962799712003, -122.38558100072221 37.790027999324934, -122.38528100208188 37.78962800123603, -122.38718100161894 37.78912800104117, -122.38708100400682 37.78852799969178, -122.38548100141152 37.78872799906098, -122.38528100297975 37.78842800098892, -122.38748100505539 37.78802799743795, -122.38738100311065 37.787427999024366, -122.38478100298826 37.787628000062846, -122.38438100054597 37.78592799857134, -122.38748100509576 37.78562800037083, -122.38768100377484 37.785029000398616, -122.38558100368324 37.78512799901186, -122.38548100180329 37.78472899996992, -122.38738099996841 37.78462899927413, -122.3875810024975 37.784028999402956, -122.38548100063656 37.78412899934696, -122.38554400320893 37.783652999355205, -122.38558099966966 37.78352900083257, -122.38758100036128 37.78332899781953, -122.3875810021359 37.78292899800441, -122.38498100195062 37.78322899848009, -122.38478100352661 37.78262899779265, -122.38748100042847 37.78242899711438, -122.38768100054048 37.78212899873571, -122.3847809996226 37.78222899825926, -122.38458100284107 37.78172900050912, -122.38758100146275 37.781528999908666, -122.38768100157439 37.781228998258484, -122.38448100225646 37.78122900004494, -122.38438100001355 37.78082900010581, -122.38758100024879 37.78062899938997, -122.38758100332691 37.78032900004468, -122.38448100341976 37.78022899942293, -122.38428100398453 37.779829001767936, -122.38748100285183 37.77962900032026, -122.38748100053651 37.779128999611785, -122.38478100144404 37.779228998973416, -122.3845810016376 37.77862900048365, -122.38498100495492 37.77652899979436, -122.38488100438116 37.77532899980873, -122.38697300010762 37.77513899792014, -122.38688100194653 37.774529000324314, -122.38208100134166 37.77472900108917, -122.38148099953897 37.771828999898744, -122.38478100269032 37.773328999119315, -122.38668100071823 37.77322900210503, -122.38658100413468 37.77292899865522, -122.38678100321586 37.7727289990162, -122.38598100065569 37.77072899795419, -122.38328100137531 37.76982900148583, -122.38348099966667 37.769328999499635, -122.38558100274967 37.770028999179054, -122.38478100294303 37.76872900130126, -122.3853810014332 37.767928999731446, -122.38428099949132 37.768029001283445, -122.38388100352711 37.76772899744175, -122.38358100481942 37.76772900106584, -122.38348100401487 37.767428997829235, -122.38488099877048 37.76732900025293, -122.3847810046246 37.76712899877415, -122.38298100207703 37.76712899848383, -122.38298100304644 37.76682900149942, -122.3856810026512 37.766729000253434, -122.38578100309918 37.765928997498236, -122.386381000682 37.76582899911649, -122.38658100205343 37.765529001057416, -122.38618100197264 37.76542899861758, -122.38598100204017 37.764628999607254, -122.38688100264714 37.763528998143435, -122.38658100006589 37.76322899861095, -122.3846810026246 37.76342900193591, -122.38438100034843 37.76262900057135, -122.38398100167419 37.76262900005052, -122.38418100344664 37.76412900034963, -122.38398100007154 37.764128999919556, -122.38368100081134 37.76232900154951, -122.38348100145556 37.762428998449245, -122.383481001323 37.7642290008324, -122.3832810045962 37.76422899910418, -122.38308100263275 37.7625289985425, -122.38248100007655 37.76262899984855, -122.38268100078963 37.764229000342034, -122.38248100406285 37.76422899725657, -122.38218100025782 37.76252899937626, -122.38198100125952 37.76252899829878, -122.38218100226393 37.76422900034, -122.38188100094393 37.76432900074983, -122.38148100046891 37.76242900017329, -122.38138100239881 37.76152899806727, -122.38128100453778 37.76022900041884, -122.3807810018731 37.76022899772142, -122.3801810012391 37.76032899856457, -122.38038100047584 37.76222899979029, -122.37998100028526 37.76242900116463, -122.38008100387016 37.75982899885921, -122.38098100159216 37.75972900172011, -122.38138100191718 37.75952899932885, -122.38128100354851 37.75812900034509, -122.38168100067048 37.75762899941728, -122.38128100060824 37.757928998512575, -122.38108100022775 37.75792899887474, -122.38118100252547 37.75742899947821, -122.38148100107736 37.75742899855319, -122.38148100118619 37.757129001174455, -122.38108100466249 37.75712899949236, -122.3808810047118 37.755228997728196, -122.38408100280787 37.75502899950989, -122.38418100202017 37.754629000791944, -122.38328100294888 37.75482899709797, -122.383081000964 37.75452899790471, -122.38278100357978 37.75352899815792, -122.3821810022939 37.75342899990055, -122.37998099942442 37.754129000029124, -122.38098100537408 37.7536289982537, -122.3806810047535 37.75332900066319, -122.38128100417994 37.75302899970743, -122.38008100149075 37.753028998059605, -122.38078100090516 37.75032999986354, -122.38328100346047 37.74833000007769, -122.38618100037941 37.74883000106122, -122.38638100338959 37.748029998386244, -122.38656900254024 37.74794599895971, -122.38728100050905 37.74762999899095, -122.39028100363068 37.74762999678085, -122.39033000024902 37.74764799970208, -122.39218100104436 37.74832999870972, -122.39248100264868 37.74843000071175, -122.39278100325132 37.747630000748735, -122.39228100392855 37.74753000008592, -122.39048100101438 37.74702999721216, -122.3872809996102 37.7471299982466, -122.38708100261928 37.7469299984553, -122.37648100421953 37.74752999805558, -122.37388100187076 37.74392999982807, -122.37088100309307 37.740029998798526, -122.37488100399071 37.73992999868749, -122.37598100441403 37.73863000105822, -122.3748600024833 37.73802799859348, -122.37498100328774 37.73773000056053, -122.37548100292949 37.73653000048807, -122.37588100037286 37.73572999885647, -122.37568100204008 37.73562999801859, -122.3753810011091 37.73542999971615, -122.37578100096488 37.735029997521515, -122.37588100207142 37.73493000016129, -122.37508100154446 37.73382999753916, -122.37208100141429 37.73382999798004, -122.37008100228212 37.73232999945852, -122.36731100068226 37.735437998521924, -122.36598100102422 37.736929999916185, -122.35958100255148 37.73122999880595, -122.35658100345557 37.729629998791125, -122.3568790004081 37.727926998262255, -122.35898100396813 37.7159309992258, -122.35920500221727 37.71582800112333, -122.36300400152439 37.717979000096, -122.36405700157711 37.71671300032407, -122.36031899918444 37.71453099891589, -122.3604869990745 37.714339998180236, -122.36418100174225 37.716030999159216, -122.36445400161692 37.71590399856862, -122.36515600209123 37.71586600093246, -122.365797002538 37.71569800085091, -122.36691000157974 37.71614000116991, -122.36749000380824 37.716835000065956, -122.3683749990992 37.71729199835145, -122.3693210002544 37.71742199920084, -122.37025200215047 37.71771999837413, -122.37078600309644 37.71797100039531, -122.37100000400689 37.71819299948425, -122.37180900016719 37.718208000748234, -122.37206900023882 37.71820800033235, -122.37255600075379 37.71820800045442, -122.37388100186226 37.71833100008517, -122.37400600448609 37.71836799845519, -122.37432600092178 37.718825998649535, -122.37458100088416 37.71913099909172, -122.37538100289382 37.7201309981441, -122.37594400363257 37.720596000191655, -122.37628100145147 37.72113100072789, -122.37655400485536 37.721533997659456, -122.37678300183003 37.72283799919967, -122.37644700235849 37.723424999852334, -122.37565400342264 37.72394400009326, -122.37513500202245 37.724165996923766, -122.3749210016768 37.72452399927996, -122.3753030035443 37.72504299961002, -122.3759890007999 37.72546299955572, -122.37641700305407 37.72566099851037, -122.37638800056448 37.72556699685407, -122.37626400356918 37.72516499970941, -122.37621800413196 37.724546998294315, -122.37692000286846 37.72401300121194, -122.37769800141137 37.72388299953827, -122.378644002854 37.723852998821556, -122.37962100351741 37.72364699817849, -122.37988000086791 37.72330400038069, -122.37985000189839 37.72237399957633, -122.38017000005064 37.721716999839416, -122.38081100400863 37.721320998428844, -122.38169600380739 37.7216109984025, -122.38228100052731 37.722430999310056, -122.38358100277931 37.72333100114615, -122.38548100231085 37.72463099985458, -122.38548100305798 37.72410999992389, -122.38548100066974 37.723631000398726, -122.3836810008543 37.7221310011884, -122.3838809993407 37.72043100053291, -122.37598100395087 37.71583100002218, -122.38032300097892 37.71176899786434, -122.38091800276834 37.71113599859713, -122.38087200259056 37.710868998897006, -122.38067400199903 37.71050199905901, -122.38064300287287 37.71010600048759, -122.3797890009414 37.70958699962468, -122.37948100130123 37.709330998483495, -122.37884300388629 37.70958699956996, -122.37801900432609 37.70983100056975, -122.37748500284071 37.710182000029086, -122.3771190029366 37.71072399948657, -122.37679800158955 37.71098299894151, -122.37644700001529 37.710670000865356, -122.37585200015609 37.71038800013021, -122.37542500168442 37.71020499874214, -122.3753030046978 37.70986899869558, -122.37550100339695 37.7095640007587, -122.37798799964864 37.70926599998801, -122.37981900007551 37.70922099886038, -122.38259100306723 37.709620997973786, -122.38267300343178 37.709633001611955, -122.38589300190183 37.709724000924474, -122.38748200412721 37.708330998039834, -122.38918199934251 37.70923100094126, -122.3900739993525 37.708502999514096, -122.39097500189953 37.70908300014288, -122.39225600022444 37.70854900070593, -122.39268200258347 37.70833099920913, -122.39378200067442 37.70823100010338, -122.39518200278752 37.708330999662124, -122.39578199970053 37.70833099906829, -122.40028199984226 37.70833099708804, -122.40098200054992 37.70833099914296, -122.40148200212309 37.70833100118608, -122.4019820003053 37.70823099895563, -122.40238200217588 37.70833099901089, -122.40268200108827 37.708330998606925, -122.40298200345678 37.70833099738305, -122.40328200250975 37.70833100094259, -122.40468200138245 37.70833100098883, -122.40522600169444 37.708270999425295, -122.40545300188073 37.7082449973271, -122.40558200081756 37.708230999559575, -122.40648200296454 37.70833099852555, -122.40718200118408 37.70823099991518, -122.41037000050905 37.70828299783252, -122.41328199980558 37.70833099874258, -122.41399700393939 37.708251999034125, -122.4141819998054 37.708231000082975, -122.41448200254572 37.70823099977452, -122.41518200089914 37.70823099712991, -122.41608200065352 37.7083309999109, -122.41708200351668 37.708330997302866, -122.41818199974108 37.70833099765228, -122.4191820009095 37.708230999385535, -122.42008199982422 37.70823099932169, -122.4223820013107 37.70823099898049, -122.42378200071062 37.708230999979925, -122.42529000244802 37.708300999459794, -122.42556500232993 37.7083139991297, -122.4257920025859 37.708324000457154, -122.42588200330975 37.70832800075177, -122.42598100233026 37.70833299931143, -122.42636600358212 37.70835099696509, -122.42683300511959 37.708372997452855, -122.4271070030497 37.708386001682314, -122.4280820022819 37.7084309994753, -122.42918200418339 37.708330998620404, -122.43002700500693 37.70828099931173, -122.43088200072462 37.70823100037583, -122.43218200087541 37.70823199975523, -122.43338199947453 37.70823200030901, -122.43398200263503 37.70813199989724, -122.43538200027704 37.70813200037619, -122.43588200127036 37.70823199943508, -122.43658200304685 37.708232001418374, -122.43818200039966 37.70823199829001, -122.4386820018935 37.70823199899797, -122.43914400124224 37.70827800027672, -122.43968200375281 37.708331999845356, -122.44008200211572 37.70833199883975, -122.44068200153147 37.7083320002477, -122.44078200202205 37.70833200063336, -122.4412820009472 37.708331998599476, -122.44208200298726 37.708231997671746, -122.44313800128221 37.708275999552555, -122.44368800013991 37.70829900050609, -122.44448200404116 37.70833199944038, -122.44538300202254 37.708231997949525, -122.44638300320416 37.708231999321306, -122.44698300267449 37.70823200158391, -122.44713400259536 37.7082319995605, -122.44928300048775 37.708231997971936, -122.44958300220287 37.708232001681175, -122.45174000130085 37.70814899904779, -122.45218300145373 37.70813199876923, -122.45398300395927 37.70823199890754, -122.45598300337267 37.70823199827751, -122.45708300073667 37.70823200066371, -122.45828300248971 37.708232000215055, -122.45978300336884 37.70823199828025, -122.46138300215941 37.708231998639484, -122.46228299917678 37.70823200031735, -122.46278300326381 37.708231997572085, -122.46358300293635 37.708332000136814, -122.46408300401478 37.70823199934587, -122.46448300054622 37.70823199791219, -122.46538300293756 37.708131999677455, -122.46628300292588 37.708131998193814, -122.46718300090454 37.70823200175634, -122.46788300052687 37.70823199906815, -122.46888300324042 37.70823199894551, -122.47088300240993 37.70823199805963, -122.471078002629 37.70826399773793, -122.47148300170781 37.7083320001422, -122.47198300064478 37.70833199964788, -122.48108300404334 37.708231998976565, -122.48378300324008 37.708331997212404, -122.48488300201318 37.70833199821316, -122.48511700385721 37.708284998125976, -122.48538300111585 37.70823199996127, -122.48563100252915 37.708232000038535, -122.48608300020194 37.70823199900761, -122.49358300166179 37.708232000682806, -122.49778300151455 37.708131998364394, -122.50248400097321 37.708132000222115, -122.50308300140364 37.7102319973883, -122.50498300393406 37.71713199822134, -122.50648300277095 37.72373099922853, -122.50758300080953 37.735330999022864, -122.50788300235479 37.73853099855516, -122.508083001515 37.74023099972811, -122.50938299924294 37.748829999406105, -122.51038300474755 37.75112999973127, -122.51118300327896 37.759729999820976, -122.51178300294251 37.763730000494725, -122.51198300189263 37.771130001673605, -122.51318300047525 37.777228999879334, -122.51378300161674 37.777628998724175, -122.51418300215099 37.77792899932094, -122.51428300412516 37.77942900016536, -122.51398300210553 37.78002899883501, -122.51448300362426 37.78082899981024, -122.51378300218984 37.781328999482945, -122.51328300290322 37.7825290017033, -122.51298300190979 37.78262899920042, -122.51248300215399 37.783528998159056, -122.51258300513095 37.78392899960029, -122.51198299962637 37.78372900120513, -122.51108300433201 37.78462900107028, -122.51078300177353 37.78432899859026, -122.5098829995426 37.78482900038453, -122.5092830012074 37.785328997582376, -122.50768300205557 37.78612900007726, -122.50708300061494 37.787129001701466, -122.5068360021144 37.787237000133615, -122.50617900515972 37.78761799915532, -122.50531000056161 37.78831200169436, -122.50390600217291 37.78847300046987, -122.50257800357896 37.78859499783952, -122.5010980035962 37.788960998516494, -122.50021300198702 37.78912099703697, -122.49879400121867 37.78886199785555, -122.49787900262069 37.788243999610145, -122.49707000242661 37.787724998699034, -122.49575799906141 37.7879920008386, -122.49458300156691 37.788129000345435, -122.49298300322864 37.787828998386395, -122.4928830030261 37.78792899943103, -122.49028300307116 37.78892899933857, -122.48958300163612 37.78942899889492, -122.48728300210053 37.78942899982554, -122.48578300261157 37.79062899934947, -122.48568700376812 37.79086799913919, -122.48530500319745 37.791859998183384, -122.48454199937417 37.79320299946908, -122.48370300339721 37.79459099934212, -122.48287900343733 37.79678099894519, -122.48252800135984 37.798498001831426, -122.48187200295274 37.79997000081161, -122.48089600306295 37.80112199866086, -122.48004100372268 37.802693001468505, -122.47973600215882 37.80413499891557, -122.4793540006034 37.805798001114184, -122.47920200420903 37.80731599850924, -122.478729004377 37.808430000002986, -122.47855700330167 37.80886500051456, -122.47846900224826 37.80908599809431, -122.47845400133515 37.80964299845812, -122.47837800518256 37.810092998733865, -122.47831700206072 37.81026099988801, -122.47808300319899 37.81082799942256, -122.4773830027518 37.8110279985324)), ((-122.36716300201184 37.82919299870149, -122.36708100308483 37.828626999020436, -122.36608100344952 37.82722699751663, -122.36448099985428 37.82482699935063, -122.36388100235804 37.823926998982515, -122.36298100271658 37.82252699870859, -122.3631010011084 37.821983999879855, -122.36197200255542 37.82156299884589, -122.3618450017319 37.82151499962196, -122.36188000319203 37.82141999920142, -122.36196900118861 37.82117899972746, -122.36316099993647 37.821531998807025, -122.36335099975614 37.82154299841797, -122.36349400314681 37.82110299719114, -122.36329000009967 37.82110300044251, -122.36211100336571 37.82079499979478, -122.3622000041691 37.82055499990635, -122.3635809993485 37.82088799957749, -122.36377700350232 37.82039899987753, -122.36237300107163 37.820088000108996, -122.36246200102853 37.81984899983388, -122.36402200304387 37.82007699826305, -122.36413100002821 37.81972099950943, -122.36418100352819 37.819726999520334, -122.36405600378356 37.81897899957584, -122.36501000387662 37.818480999660906, -122.3648410003743 37.818272001247166, -122.36401200236367 37.81871599902412, -122.36399600413233 37.818619999082856, -122.36471900195463 37.81824999926998, -122.36451599955413 37.818067998389196, -122.36468100183514 37.81792699772325, -122.36528100188069 37.8185270014068, -122.36910000184953 37.816880998187706, -122.37108100133965 37.81602699760978, -122.37103099927924 37.81597399889649, -122.36768100331768 37.812427999528026, -122.35918100265783 37.81502699846614, -122.3585810017319 37.81452699699352, -122.35928100047441 37.81372700057154, -122.36088100204721 37.812527998422134, -122.36058100127595 37.81152800040237, -122.36058100256236 37.81122799902944, -122.361179004291 37.81040600068494, -122.36138100016014 37.81012800024669, -122.36202100317063 37.80827199946075, -122.36238100292474 37.807227997360634, -122.36650700180623 37.807641000840526, -122.36738100030271 37.80772799774915, -122.36788100083449 37.8079279987201, -122.37280000311387 37.81110000054927, -122.37218100101293 37.81162800002688, -122.37168100183798 37.8151279988098, -122.37206800234647 37.815715999050425, -122.37938100032665 37.82682699963543, -122.37919899994881 37.82737100120159, -122.37880800045005 37.82854399813841, -122.37849999993679 37.829469000661334, -122.3781809998623 37.83042699947194, -122.37328100448337 37.832427001568426, -122.36878100241123 37.83122700093247, -122.36748100402188 37.83012699761026, -122.36744800353966 37.829471999042084, -122.36721700326247 37.82953700093724, -122.36722900287971 37.829736999203334, -122.36723700117226 37.82987499830906, -122.3671230027669 37.82988599902703, -122.36696600058231 37.829558001127424, -122.36681100337452 37.82963900024869, -122.36686500153053 37.82996599685507, -122.36671700169016 37.829966001168025, -122.36662800082523 37.829702996984665, -122.36626300072002 37.82982099985441, -122.36666900365388 37.830695998775276, -122.36648100285902 37.83082699870815, -122.365481001984 37.82902700074701, -122.36589000274783 37.828924999716435, -122.36623500032792 37.829482999873434, -122.36645800200628 37.82939199835599, -122.36636400279204 37.82908000045255, -122.36656700195826 37.82903700057768, -122.36666900302782 37.8293540001057, -122.36690899926681 37.82924200103523, -122.36684500066215 37.82897300048164, -122.36694600345203 37.82889699802432, -122.36703400094656 37.82921399912601, -122.36716300201184 37.82919299870149)), ((-123.00111000270672 37.695192001133705, -123.00253100370638 37.695444997581575, -123.00345300210193 37.695788997220745, -123.0036830019372 37.69624699811416, -123.00365400082204 37.69670400019134, -123.00417200143359 37.697298996974595, -123.00466100074497 37.69739099868985, -123.00506500054387 37.69723099761218, -123.00518100144767 37.69661400067909, -123.00532500004532 37.696568001206245, -123.0054690022101 37.696660001281806, -123.00589999986454 37.69741499915903, -123.00627500045292 37.69752899792517, -123.0067930007235 37.69743799841458, -123.00745600087302 37.69764400026236, -123.00768600470286 37.697621999853006, -123.00817600059604 37.697712998813, -123.00823399988722 37.69789700135954, -123.00811800235707 37.69837699902433, -123.00699400181824 37.69826199835657, -123.00670599881865 37.69833099757466, -123.0064460005627 37.69904000146314, -123.00630200296706 37.69913199844544, -123.00558200266919 37.69908600007524, -123.00555300219983 37.69947499730165, -123.00538000266606 37.69977299986132, -123.00489000354236 37.699748999922235, -123.00480300234969 37.6999549984153, -123.00486100008541 37.700046998984696, -123.00514900227475 37.70029900012896, -123.00523500220736 37.70064199944918, -123.00500400361085 37.70114599841793, -123.00451400222191 37.70169499896748, -123.00382300150615 37.70151199937571, -123.00376500032873 37.70139699753734, -123.00388100244318 37.700847999664305, -123.00382300262318 37.70089400107738, -123.00341999953797 37.700709999069915, -123.00316100181021 37.70043500004306, -123.00229600134932 37.70045799736332, -123.00177700390975 37.7007319988998, -123.00111000086969 37.70047999881921, -123.0007709996588 37.69981599838822, -123.00035500224233 37.699701998171406, -123.0002400028299 37.69961099977336, -123.00021100089211 37.69942699894536, -123.00012400366941 37.699382000982546, -123.0000960014585 37.69917600007811, -122.99989399912178 37.69899299898149, -122.9998930003183 37.69851199989493, -123.00009500211924 37.698328998595734, -123.00003700513636 37.697985000303, -122.99951800227655 37.697252996765705, -122.99946000173628 37.69697800080927, -122.99954600208707 37.696818001868245, -122.99992100054465 37.69681799722664, -123.00023800369298 37.697068998544886, -123.00052600279011 37.69711500024993, -123.00049700203641 37.696886000487616, -123.0002660035395 37.696610999776055, -123.00023700378298 37.69638199993202, -123.00068200255717 37.69587999884913, -123.00069700305475 37.69510099745651, -123.00111000270672 37.695192001133705)), ((-122.33228100158115 37.78812799980015, -122.32768100246903 37.7808279987132, -122.33108100038895 37.78092800001, -122.33201400085392 37.78174300011748, -122.33208100216164 37.78252800069501, -122.33228100158115 37.78812799980015)), ((-123.01347600092876 37.70000499958788, -123.01327399998544 37.70032599968385, -123.01310100316718 37.70032599944024, -123.01266900112327 37.70007399918597, -123.01200600443049 37.700256000124845, -123.01154600007322 37.69979799984658, -123.01094100228752 37.699752001529404, -123.01068100273703 37.699637999483656, -123.01048000409288 37.69924800015336, -123.00990400166039 37.69906499944825, -123.00921300023637 37.698697999729966, -123.00895400234675 37.69840099845557, -123.00912700433157 37.69787400040528, -123.00953000015231 37.69757699987076, -123.009876003743 37.697484998231424, -123.01076900283611 37.69766899855054, -123.01091300262294 37.69757700135604, -123.01102900395891 37.697348999451464, -123.01111600061583 37.696616999641336, -123.01126000028974 37.696548000131685, -123.01157700000833 37.69659399929213, -123.01200899954962 37.696822999557135, -123.01252700154807 37.6974869994908, -123.01241200351207 37.69767099901392, -123.01192200134717 37.697966997496025, -123.01195000148375 37.698286998172236, -123.01272800143067 37.69812800014598, -123.01318900158269 37.6982190010152, -123.0134480021094 37.6984709997004, -123.01333300237832 37.698837999284144, -123.01339000252416 37.69904399703684, -123.0137640047515 37.69943299926414, -123.01347600092876 37.70000499958788)), ((-122.42168200482519 37.82512699811889, -122.42248200258321 37.82572699827223, -122.42438199964535 37.82672699952341, -122.4253820042754 37.82842699881472, -122.42408200204082 37.828626998261335, -122.42368200007947 37.82832699863389, -122.42248200292694 37.82772699886725, -122.42148200138759 37.82722700189391, -122.42048200318922 37.82562699879474, -122.42168200482519 37.82512699811889)), ((-122.41988200267829 37.86042600167199, -122.42038200312092 37.863425998396984, -122.41868200325594 37.86212599890519, -122.41988200267829 37.86042600167199)), ((-122.51620500120494 37.778630997843884, -122.51602200234255 37.778738001488776, -122.51596100276053 37.77882199772567, -122.51588400459404 37.77892899870388, -122.51571700453583 37.778889998592945, -122.5155950031508 37.77885199795617, -122.51547300192932 37.77870699983636, -122.51548800278552 37.77858499751948, -122.51554900052089 37.778486000339115, -122.51541200137721 37.77840199965234, -122.5153050016908 37.778248998436744, -122.51525800475858 37.778051000620074, -122.51518300193746 37.77792899856511, -122.51530400196889 37.77785299809064, -122.51553400215212 37.777852999687056, -122.51565600057395 37.77782999836805, -122.51576299947232 37.7777310001477, -122.51591600237066 37.77768500090721, -122.51616000351468 37.77769999931958, -122.51631200399288 37.7777989992603, -122.51648000297844 37.77774599897571, -122.51661699989134 37.777776001192485, -122.51672400322026 37.777822001052066, -122.51686199994343 37.7778529995177, -122.51701400279772 37.777890999212225, -122.51708400342807 37.77792899810001, -122.51719700218675 37.77806600111926, -122.51712100220476 37.77818799809985, -122.51702900152384 37.7783179987757, -122.51675500214021 37.77843999873117, -122.5164650040058 37.778538999322315, -122.51620500120494 37.778630997843884)), ((-123.10750800002835 37.77175499844693, -123.10713300119025 37.772075000678605, -123.10690300499496 37.7720069985708, -123.10667200348486 37.77168599871044, -123.10649800205516 37.771137000056626, -123.10664199927072 37.77074800015684, -123.10713200321936 37.77047299935116, -123.10788200488287 37.77038099905438, -123.10802700119842 37.77054100047744, -123.10785399971677 37.77077000037584, -123.10716200491966 37.77102199829375, -123.10704600363734 37.77113699900525, -123.10707500012644 37.77129700120687, -123.1075080025883 37.77150300080888, -123.10750800002835 37.77175499844693)), ((-123.00359900339797 37.693247999354604, -123.00408899876726 37.69306499862042, -123.00437700229187 37.6931109988056, -123.00452100156134 37.693247998974776, -123.00440500436073 37.693568997822425, -123.003915001201 37.693888996984825, -123.00270499937712 37.69400299767456, -123.00256099979113 37.693774000094244, -123.00273400329216 37.6934989993888, -123.00359900339797 37.693247999354604)), ((-122.51702900310198 37.7803249983712, -122.51692300098473 37.7801719979865, -122.51684600324721 37.780049998117946, -122.51678400171873 37.77992899977482, -122.51678500347093 37.77978300022179, -122.51693800149978 37.7796679985835, -122.51712100146293 37.77965300067621, -122.51739600164501 37.779675998345205, -122.51747200214444 37.77971400148302, -122.51753300039019 37.779782999953966, -122.51768400315288 37.77982899866695, -122.51776200283318 37.77989000078505, -122.51791400173342 37.779957999157986, -122.5178990026191 37.780057001372505, -122.5178380022953 37.78021799902128, -122.51785300213275 37.78043900005264, -122.5177920021776 37.78059200121239, -122.51757900159674 37.780636999560066, -122.51739599946208 37.78069799899181, -122.51718400257755 37.78072900009292, -122.51712100190518 37.78058399948273, -122.51707500125039 37.780447001093464, -122.51702900310198 37.7803249983712)), ((-123.00384999931511 37.70327499888906, -123.00370600358042 37.70361799885165, -123.00341700019264 37.70393799796547, -123.00304200342546 37.704097999556716, -123.00269699971521 37.70398399813806, -123.00269700188949 37.70359500037963, -123.0031010014828 37.70320599860266, -123.0036770011639 37.702954000321306, -123.00384999931511 37.70327499888906)), ((-123.0995460027699 37.76754499818914, -123.09940200208133 37.7677279998736, -123.09925800293647 37.7677739994194, -123.09853700426883 37.7675450000933, -123.0983630002649 37.767041998606416, -123.0985080038332 37.76692699898709, -123.09899800484664 37.76695000140429, -123.0995460027699 37.76754499818914)), ((-123.0976120009704 37.76365299829198, -123.09781400411646 37.7636309981189, -123.09824700102378 37.76374499809887, -123.09865000463316 37.7636299997279, -123.09888100205548 37.76374499778605, -123.09899700129891 37.76401900156274, -123.098881001601 37.76408799729315, -123.09859300441137 37.76411099699873, -123.09764099928047 37.76404299845965, -123.09749700120625 37.76383699882377, -123.0976120009704 37.76365299829198)), ((-123.1009589991204 37.7665369989823, -123.10087200035517 37.76674299873064, -123.10044000465767 37.76692599934588, -123.09989200481239 37.76685799960913, -123.09954600442887 37.766628997915234, -123.09983400209302 37.76646899962273, -123.10061299984976 37.76646899815792, -123.1009589991204 37.7665369989823)), ((-123.03262600138423 37.72711900043118, -123.03251100163565 37.72739400061298, -123.03219400197824 37.727554000217786, -123.03196300239823 37.72755399836176, -123.0318480038114 37.72730199811071, -123.03207900402207 37.72695799971722, -123.03239600120155 37.72691299967207, -123.03262600138423 37.72711900043118)), ((-122.3807810018731 37.76022899772142, -122.38098100235005 37.76212899818279, -122.38068100111626 37.76222900117723, -122.3807810018731 37.76022899772142)), ((-123.00431200200299 37.7021069997962, -123.00454200320593 37.70226799899519, -123.00439800159984 37.702564998381966, -123.00411000053423 37.702633999001435, -123.00399500472496 37.70242799703011, -123.00399500231723 37.702221997682166, -123.00408100409145 37.7021300007244, -123.00431200200299 37.7021069997962)), ((-123.00563700316685 37.70277199925328, -123.00563700322542 37.7028629971686, -123.00557900151968 37.703068999958006, -123.00526200314758 37.70318400116316, -123.00517600268698 37.70302300049713, -123.00546400140104 37.70265699995382, -123.00563700316685 37.70277199925328)))"

sf_df = sedona.sql(f"""
SELECT * FROM places 
WHERE ST_Contains(
    ST_GeomFromWKT('{SF}'), 
    geom)
""")

sf_df.createOrReplaceTempView("sf_places")

In [None]:
sf_df.show(10)

# Mapping the data

In [None]:
sf_map = SedonaKepler.create_map(sf_df, "Places")
sf_map

# Searching by business name

Let's search for Starbucks locations within San Francisco.

In [None]:
sbux_df = sedona.sql("""
SELECT *
FROM sf_places
WHERE LOWER(name) = 'starbucks'
""")

sbux_df.createOrReplaceTempView("sbux")

sbux_map = SedonaKepler.create_map(sbux_df, "Starbucks in San Francisco")
sbux_map

# Foursquare Categories

The Foursquare data assigns categories to most places, but not all. You can find more detail about the [categories in the Foursquare docs](https://docs.foursquare.com/data-products/docs/categories).

In [None]:
sedona.sql("""
SELECT 
    fsq_category_labels[0] as primary_category,
    COUNT(*) as count
FROM sf_places
GROUP BY fsq_category_labels[0]
ORDER BY count DESC
LIMIT 20
""").show(truncate = False)

## Subset by category

In [None]:
category_places = sedona.sql("""
SELECT * FROM sf_places
WHERE ARRAY_CONTAINS(fsq_category_labels, 'Dining and Drinking > Cafe, Coffee, and Tea House > Coffee Shop')
""")

category_places.createOrReplaceTempView("category_places")

category_places.show(10)

In [None]:
map_category = SedonaKepler.create_map(category_places, "Places")
map_category

# Aggregation: Counting coffee shops in each neighborhood

To do this, we need a dataset of polygons for the SF neighborhoods. We have one from the city's [open data portal](https://www.sf.gov/departments--city-administrator--datasf) that we have loaded into an S3 bucket as a CSV file.

In [None]:
SF_NEIGHBORHOODS_URL = "s3://wherobots-examples/data/sf_neighborhoods.csv"

neighborhoods = (sedona.read.format('csv')
    .option('header', 'true')
    .option('delimiter', ',')
    .option('inferSchema', 'true')
    .load(SF_NEIGHBORHOODS_URL)
    )

neighborhoods.show(10)

We can aggregate the count of coffee shops from Foursquare into these neighborhoods using a Spatial SQL query. `ST_GeomFromWKT` converts the text in the CSV file to a native geometry value.

In [None]:
neighborhoods.createOrReplaceTempView("neighborhoods")

neighborhood_agg = sedona.sql("""
    SELECT 
        ST_GeomFromWKT(n.the_geom) AS geometry,
        n.neighborho AS neighborhood,
        COUNT(*) as location_count,
        collect_list(p.name) AS coffee_shops
    FROM category_places p
    JOIN neighborhoods n
    ON ST_CONTAINS(ST_GeomFromWKT(n.the_geom), p.geom)
    GROUP BY n.the_geom, n.neighborho
    """)

In [None]:
neighborhood_agg.show(10)

We'll use Sedona Kepler again, this time to visualize the data as a choropleth map. The darker the color, the more coffee shops there are in the neighborhood. We can see the downtown business district stands out in the eastern half of the city.

In [None]:
import json
with open('assets/conf/map_config_foursquare.json', 'r') as file:
    map_config = json.load(file)

map = SedonaKepler.create_map(df=neighborhood_agg, name="Coffee Shop Count", config=map_config)
map

# How is the Foursquare Places dataset different than other POI datasets?

The Foursquare Places dataset is continually checked and updated by the community though the Foursquare Placemaker Tools.

Foursquare's Placemaker Tools are a set of web-based tools that let individuals contribute to the Foursquare Places dataset and validate the accuracy of places in the dataset. 
These tools, currently in beta, allow users to add new venues, update existing information, and validate place data. 

Key features include:
- Simple process for adding or editing places.
- A collaborative system for verifying edits to ensure accuracy and transparency.
- Integrates verified community insights immediately into the global Places dataset, ensuring it remains current and reflective of real-world changes.

By leveraging community knowledge, Foursquare's Placemaker Tools enhance the quality and accuracy of Foursquare's Places data.


<img src="./assets/img/placemaker_tools_2.png" alt="foursquare placemaker tools" width="800"/>

# Snapshots and versions

The Wherobots Open Data catalog keeps mutliple versions of the Foursquare data, snapshots taken of each release. Havasu, the Wherobots optimization of Iceberg GEO, supports built-in tags that are used for versioning.

We can query what versions exist like this:

In [None]:
sedona.sql("SELECT * FROM wherobots_open_data.foursquare.places.refs WHERE type = 'TAG' ORDER BY name DESC").show()

By default, queries will return data from the latest snapshot. If you want to query a specific snapshot, you can use the `VERSION AS OF` clause with a Foursquare release name. For example, we could look at the February 2025 snapshot.

In [None]:
sedona.sql("SELECT * FROM wherobots_open_data.foursquare.places VERSION AS OF 'dt=2025-02-06'").show(5, truncate = True)