Include timestamps in download data #2416
We have it. There is no SQLite that I know of going on. I'd have to rewrite …

I was pretty sure the logs get slurped into sqlite, and then results queried from there get splooged into downloads.json.

Did you make that change? Cause I didn't. We talked about it but then I …

Looks like you did it in October: 8acf077. LOL.

OMG MY MEMORY IS GOING!!!!! Arrrgh. Ok.

You must have been on a 3-day coding vision quest at the time, and now you have only patchy memories...

Yes, now I remotely remember a conversation about how I probably didn't …

And you were right!
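The pipeline discussed above (access logs slurped into SQLite, aggregate counts dumped to downloads.json) can be sketched roughly as follows. This is a minimal illustration only: the table name `downloads`, its columns, and the sample data are all hypothetical, not MELPA's actual schema.

```python
import json
import sqlite3

# Hypothetical sketch: log entries land in a SQLite table, and
# aggregate per-package counts get written out as downloads.json.
connection = sqlite3.connect(":memory:")
cursor = connection.cursor()
cursor.execute("CREATE TABLE downloads (pkg TEXT NOT NULL, ip TEXT NOT NULL)")

# A few fake log entries: (package, client IP).
cursor.executemany(
    "INSERT INTO downloads VALUES (?, ?)",
    [("magit", "1.1.1.1"), ("magit", "2.2.2.2"),
     ("magit", "1.1.1.1"),  # repeat download from one IP counts once
     ("evil", "3.3.3.3")],
)

# Aggregate: unique downloading IPs per package.
counts = dict(
    cursor.execute("SELECT pkg, COUNT(DISTINCT ip) FROM downloads GROUP BY pkg")
)
payload = json.dumps(counts, sort_keys=True)
print(payload)  # {"evil": 1, "magit": 2}
```

Note that this aggregation throws the timestamps away entirely, which is why the issue below asks for them to be kept.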
Hi people, some thoughts about this:

Here is some code that shows a prototype of how this might work:

```python
import collections
import datetime
import json
import random
import sqlite3
import string
import time

CREATE_QUERY = ("CREATE TABLE pkg_ip_time (pkg STRING NOT NULL, ip INT, "
                "dl_time DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP)")
COUNT_QUERY = ("SELECT pkg, DATE(dl_time), COUNT(ip) FROM pkg_ip_time "
               "GROUP BY pkg, DATE(dl_time)")


def main():
    connection = sqlite3.connect(':memory:')
    cursor = connection.cursor()

    print("0. Creating table")
    cursor.execute(CREATE_QUERY)

    print("1. Creating random records")
    packages = string.ascii_letters
    stats = [(random.choice(packages), random.randint(0, 2 ** 32),
              datetime.datetime.now() - datetime.timedelta(
                  seconds=random.randint(0, 60 * 60 * 24 * 30)))
             for _ in range(100 * 1000)]

    print("2. Inserting created records")
    cursor.executemany("INSERT INTO pkg_ip_time VALUES (?, ?, ?)", stats)
    connection.commit()

    print("3. Generating statistics")
    start = time.time()
    stats_per_package = collections.defaultdict(dict)
    for package, date, downloads in cursor.execute(COUNT_QUERY):
        stats_per_package[package][date] = downloads
    for package, stats in stats_per_package.items():
        with open("melpa-{}-stats.json".format(package), mode="w") as stats_file:
            json.dump(stats, stats_file)
    print(time.time() - start)


main()
```

I imagine the … Let me know if there are other things I can look into to help :)
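To show what the per-package files from a prototype like the one above would buy us: the `{date: count}` mapping can be turned into a dense, ordered time series suitable for the charts mentioned below. The file layout here follows the prototype's output, not any actual MELPA format, and the sample data is made up.

```python
from datetime import date, timedelta

# Stand-in for json.load(open("melpa-<pkg>-stats.json")): a sparse
# {ISO date: download count} mapping as the prototype would write it.
stats = {"2015-01-10": 4, "2015-01-12": 7}

first = date.fromisoformat(min(stats))
last = date.fromisoformat(max(stats))

# Walk day by day so dates with no recorded downloads become explicit
# zeros, which charting libraries generally need.
series = []
day = first
while day <= last:
    series.append((day.isoformat(), stats.get(day.isoformat(), 0)))
    day += timedelta(days=1)

print(series)
# [('2015-01-10', 4), ('2015-01-11', 0), ('2015-01-12', 7)]
```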
Ping? I'd love to get such a feature.
I'd like to be able to show charts of download activity over time for individual packages, which would involve dumping per-package json which includes timestamps (or at least dates). Do we have all the old logfiles in order to produce this? The sqlite aggregation would need to be changed, of course.