Skip to content

Conversation

@abhinavk96
Copy link
Contributor

@abhinavk96 abhinavk96 commented Jun 17, 2019

Fixes #6054

Short description of what this resolves:

Adds a cron job, which populates and refreshes the event locations every 7 days with the most popular event locations.

@codecov
Copy link

codecov bot commented Jun 17, 2019

Codecov Report

Merging #6056 into development will decrease coverage by 0.05%.
The diff coverage is 30%.

Impacted file tree graph

@@              Coverage Diff               @@
##           development   #6056      +/-   ##
==============================================
- Coverage        66.15%   66.1%   -0.06%     
==============================================
  Files              285     285              
  Lines            14060   14080      +20     
==============================================
+ Hits              9301    9307       +6     
- Misses            4759    4773      +14
Impacted Files Coverage Δ
app/api/event_locations.py 51.72% <30%> (-48.28%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update cb19e9b...21403bb. Read the comment docs.

.group_by(Event.searchable_location_name)\
.order_by(desc('counts'))\
.limit(5)
db.session.query(EventLocation).delete()
Copy link
Member

@mrsaicharan1 mrsaicharan1 Jun 17, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks fine but just one doubt. Why are we running a delete on this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

umm.. because if we only keep adding locations and not removing them we will have multiple copies of the same location. Every week say delhi will keep getting added. @mrsaicharan1

mrsaicharan1
mrsaicharan1 previously approved these changes Jun 18, 2019
uds5501
uds5501 previously approved these changes Jun 18, 2019
Copy link
Contributor

@uds5501 uds5501 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks Good

Copy link
Member

@iamareebjamal iamareebjamal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The query you wrote of fetching the most popular events is not that expensive that we cache that in other table and run a cron for every 7 days. Just use that query directly

Secondly, the logic of taking serachable location name is wrong.

Suppose, there are 3 events

AMU, Aligarh
Dodhpur, Aligarh
Railway Road, Aligarh

All of them would have a count of 1 and Aligarh will never be shown. For this to work, you need exact address matches

Also, the location should be city and this will return address. Like Chandni Chowk, Delhi instead of Delhi

@abhinavk96
Copy link
Contributor Author

abhinavk96 commented Jun 18, 2019

Secondly, the logic of taking a serachable location name is wrong.
Suppose, there are 3 events
AMU, Aligarh
Dodhpur, Aligarh
Railway Road, Aligarh
All of them would have a count of 1 and Aligarh will never be shown. For this to work, you need exact address matches

@iamareebjamal I don't agree with this inference. AMU, Aligarh, Dodhpur, Aligarh, Railway Road, Aligarh will all have the searchableLocationName as Aligarh hence the count for Aligarh will be 3.
I am at no point using the address field which stores the entire address. I am grouping the events by their `searchableLocationNames. Exact address matches don't come in the picture? Please correct me if I'm wrong.

Also regarding, caching, do we need a separate endpoint for this then? What will we use /event-locations for as a use case, as per the discussion in the parent issue my understanding was that /event-locations should return the most recent locations.

Also, the location should be city and this will return address. Like Chandni Chowk, Delhi instead of Delhi

It does not.

Edit: I have printed the results after running the query on the production dataset, they were result objects of the form (Delhi, 21), (Singapore, 5) etc.

@prateekj117
Copy link
Member

@iamareebjamal The problem was that searchabeLocationName was not being filled on creating an event. @CosmicCoder96 solved it (fossasia/open-event-frontend#3150). So, this logic would work and it will only return the city name.

prateekj117
prateekj117 previously approved these changes Jun 18, 2019
Copy link
Member

@prateekj117 prateekj117 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@abhinavk96
Copy link
Contributor Author

abhinavk96 commented Jun 18, 2019

@prateekj117 This implementation is agnostic of the PR you mentioned. If searcable location was null or is null which it might as well be due to various reasons, this query will (and should) return nothing.

@iamareebjamal
Copy link
Member

OK, if searchable location name is of city, this will work, but we still don't need a cron job for this. Please, just use the query directly. And no, we don't need a separate endpoint for this, use the event-locations endpoint only, just no need of running a cron and saving in a separate table

@prateekj117
Copy link
Member

@CosmicCoder96 It's confusing to me that searchable location name were not all null in production database, and returning saving nothing in case of development database.

@abhinavk96
Copy link
Contributor Author

@prateekj117 fossasia/open-event-frontend#3150 (comment)

@iamareebjamal
Is this okay ? Same logic, but on every api call. Or are you talking about something else.

class EventLocationList(ResourceList):

    """
    List event locations
    """
    def before_get(self, args, kwargs):
        popular_locations = db.session.query(Event.searchable_location_name, func.count(Event.id).label('counts')) \
            .group_by(Event.searchable_location_name) \
            .order_by(desc('counts')) \
            .limit(5)
        db.session.query(EventLocation).delete()
        for location, _ in popular_locations:
            if location is not None:
                new_location = EventLocation(location)
                db.session.add(new_location)
        db.session.commit()

@iamareebjamal
Copy link
Member

iamareebjamal commented Jun 18, 2019

popular_locations = db.session.query(Event.searchable_location_name, func.count(Event.id).label('counts')) \
            .group_by(Event.searchable_location_name) \
            .order_by(desc('counts')) \
            .limit(5)
locations = []
for location, _ in popular_locations:
            if location is not None:
                locations.append(EventLocation(location))
return locations

No need of delete and insert

@abhinavk96
Copy link
Contributor Author

abhinavk96 commented Jun 18, 2019

@iamareebjamal But popular_location above just contains a query string and result sets are of the format (eventlocationName, count), how will this initialize the objects returned by the query with EventLocation class which we need it to return corresponding to EventLocations api schema?

@iamareebjamal
Copy link
Member

@CosmicCoder96 Updated the command

@abhinavk96 abhinavk96 dismissed stale reviews from prateekj117, uds5501, and mrsaicharan1 via 5350111 June 18, 2019 14:16
@abhinavk96 abhinavk96 changed the title feat: Add cron job to update popular event locations feat: List popular event locations at /event-locations Jun 18, 2019
new_location.id = len(locations)
locations.append(new_location)
schema = EventLocationSchema()
result = schema.dump(locations, many=True).data
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@iamareebjamal Had to make these changes to keep it JSON-API compliant. Please review.

@abhinavk96 abhinavk96 requested a review from iamareebjamal June 18, 2019 14:17
iamareebjamal
iamareebjamal previously approved these changes Jun 18, 2019
@iamareebjamal
Copy link
Member

Travis failing

@abhinavk96
Copy link
Contributor Author

@iamareebjamal Fixed now.

@iamareebjamal iamareebjamal merged commit 8f2de81 into fossasia:development Jun 18, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Populate event locations with most frequent locations routinely

5 participants