Import fails with "unexpected EOF" #602

Closed
SIMULATAN opened this issue Feb 1, 2024 · 6 comments
Comments

@SIMULATAN

Describe the bug
On a clean instance with a new (admin) user, importing fails with the message: fetching data dump for user 'SIMULATAN' failed - Get "https://api.wakatime.com/api/v1/users/current/data_dumps": unexpected EOF. After that error, the console also prints: downloaded 0 heartbeats for user 'SIMULATAN' (0 actually imported).
logs

After manually retrying the import a few times (which, btw, involved deleting the last import keys from the DB as it counted it as a successful import!), I managed to get it to print "fetch xxx bytes" to the log, but otherwise the behavior was the same.

Using the legacy importer only downloads the latest heartbeats (~5000, corresponding to the 18th through the 29th, when the dump was created) and those don't match by a long shot (38h vs 54h).

NOTE: the dump file has a size of 130MB and contains ~196 000 heartbeats

System information

  • Wakapi version: 2.4.10
  • Environments
    • Docker @ my Laptop (not at home)
    • Kubernetes @ Homelab (entirely different network)
  • Image: ghcr.io/muety/wakapi:2.10 (ghcr.io/muety/wakapi@sha256:76c78e75d01f5f95f3856b4d1baf308ca26cc2000d4f70c078a9471ddb2e7401) - downloaded just yesterday
  • Database: SQLite & PostgreSQL

This problem already occurred a good few weeks ago when I used SQLite in a minimal testing setup. I didn't think much of it and waited until now to retry to properly host wakapi. Now, it's coming up again.

Lastly: thanks for your awesome work on wakapi! Really looking forward to using it.

@muety
Owner

muety commented Feb 1, 2024

There were issues downloading the actual dump file (see #542) in the past, but I never saw this error when requesting the list of dumps from WakaTime. Strange. Can you please do me a favor and manually run that request (e.g. using curl or your browser) and report the result of that? See #506 (comment).

Using the legacy importer only downloads the latest heartbeats [...]

It downloads the range between the date of the latest heartbeat in your database and now.

[...] and those don't match by a long shot (38h vs 54h)

This seems a bit much, but generally, discrepancies between Wakapi's and WakaTime's reported total time are common, as explained in #587, #488, #334, ...

[...] which, btw, involved deleting the last import keys from the DB as it counted it as a successful import!)

Yep, there is a back-off mechanism that only allows imports every WAKAPI_IMPORT_MAX_RATE hours.

@SIMULATAN
Author

Thanks for the quick response!

Can you please do me a favor and manually run that request (e.g. using curl or your browser)

Works just fine every time.
image
I managed to successfully download the dump using a browser on the first attempt, hence the report of the heartbeats. Using wget works just fine too, as well as curl.

I went ahead and tried to do the curl call in the docker container too, which led to the same positive result as locally.

It downloads the range between the date of the latest heartbeat in your database and now.

I see, that explains it. Thanks for clearing that up.

Yep, there is a back-off mechanism that only allows imports every WAKAPI_IMPORT_MAX_RATE hours.

I'm aware. My main point was that it recorded the failure as a successful import, so the 24h default rate for successful imports applied rather than the 5-minute back-off.

@muety muety closed this as completed in 2161c88 Feb 2, 2024
@muety
Owner

muety commented Feb 2, 2024

Btw. there are WAKAPI_IMPORT_BACKOFF_MIN and WAKAPI_IMPORT_MAX_RATE. The first one controls how often imports can be attempted generally, while the second one essentially determines how often data can actually be imported, i.e. how often a successful import can happen.

@muety
Owner

muety commented Feb 2, 2024

@alanhamlett Just being curious, were there any recent changes on WakaTime's end that might be related to

// workaround for https://github.com/muety/wakapi/issues/602
// super weird behavior:
// when keeping req.Close set to false (keep connection alive), we'll get an "unexpected EOF" error inside checkDumpAvailable(),
// even though I can neither reproduce this error with curl, nor in a minimal, stand-alone go example (https://go.dev/play/p/HY_RLtTWnkk works totally fine)
// even weirder: even when creating a whole new http.Client for every request, the issue keeps occurring
// this used to be working in the past and suddenly broke at some point (change on wakatime's end?)
req.Close = true
?

@SIMULATAN
Author

SIMULATAN commented Feb 2, 2024

Now that was a quick fix 🔥 thanks!

The good news: the EOF error is gone, it successfully downloaded ~195 000 heartbeats which aligns with my manual JQ counting.

The bad news:

  1. it failed to create tons of summaries (see log below)
  2. the total time is off by a long shot (15h vs 21h this week, 1300h vs 1600h all time)
    • the keystroke timeout set on wakatime is 15 minutes
    • not sure if this is still within the discrepancy range caused by the different algorithms
      • considering the long timeout, probably yes?
Logs
2024-02-02T15:36:24.559729361Z [INFO ] data dump for user 'SIMULATAN' is available for download
2024-02-02T15:36:27.300464454Z [INFO ] fetched 136088063 bytes data dump for user 'SIMULATAN'
2024-02-02T15:41:48.635676407Z [INFO ] downloaded 196001 heartbeats for user 'SIMULATAN' (195889 actually imported)
2024-02-02T15:41:48.635693309Z [INFO ] clearing summaries for user 'SIMULATAN'
2024-02-02T15:41:48.637229407Z [INFO ] generating summaries
2024-02-02T15:41:48.855552359Z [INFO ] successfully generated summary (2021-09-21 00:00:00 +0000 UTC, 2021-09-22 00:00:00 +0000 UTC, SIMULATAN)
2024-02-02T15:41:48.856504301Z [ERROR] failed to save summary (SIMULATAN, 2021-09-21 05:53:48.62 +0000 UTC, 2021-09-21 13:20:47.381 +0000 UTC) - unsupported data type: [0xc00af7a240 0xc00af04d40 0xc00af04e00 0xc00af04e40 0xc00af04dc0 0xc00af04ec0 0xc00af04e80]: Table not set, please set it like: db.Model(&user) or db.Table("users")
2024-02-02T15:41:48.862255555Z [INFO ] noop mail service doing nothing instead of sending password reset mail to [[my@email.com]]
2024-02-02T15:41:48.862294456Z [INFO ] sent import notification mail to SIMULATAN
2024-02-02T15:41:48.869484868Z [INFO ] successfully generated summary (2021-10-31 00:00:00 +0000 UTC, 2021-11-01 00:00:00 +0000 UTC, SIMULATAN)
2024-02-02T15:41:48.869988287Z [ERROR] failed to save summary (SIMULATAN, 2021-10-31 11:23:28.746 +0000 UTC, 2021-10-31 14:31:43.186 +0000 UTC) - unsupported data type: [0xc00b0659e0 0xc00afabd80 0xc00afabf40 0xc00afabec0 0xc00afabf00 0xc00afabf80 0xc00afabe80 0xc00af05180]: Table not set, please set it like: db.Model(&user) or db.Table("users")
2024-02-02T15:41:48.879226875Z [INFO ] successfully generated summary (2021-09-22 00:00:00 +0000 UTC, 2021-09-23 00:00:00 +0000 UTC, SIMULATAN)
2024-02-02T15:41:48.879319695Z [ERROR] failed to save summary (SIMULATAN, 2021-09-22 06:29:10.948 +0000 UTC, 2021-09-22 13:56:35.292 +0000 UTC) - unsupported data type: [0xc00af7a900 0xc00b2de380 0xc00b097900 0xc00b097940 0xc00b0978c0 0xc00b2de400 0xc00b2de3c0]: Table not set, please set it like: db.Model(&user) or db.Table("users")
2024-02-02T15:41:48.914696186Z [ERROR] failed to save summary (SIMULATAN, 2021-09-23 06:16:59.842 +0000 UTC, 2021-09-23 09:50:34.431 +0000 UTC) - unsupported data type: [0xc00af7afc0 0xc00af055c0 0xc00adace40 0xc00adace80 0xc00adacec0 0xc00af05640 0xc00af05600]: Table not set, please set it like: db.Model(&user) or db.Table("users")
2024-02-02T15:41:48.914638916Z [INFO ] successfully generated summary (2021-09-23 00:00:00 +0000 UTC, 2021-09-24 00:00:00 +0000 UTC, SIMULATAN)
2024-02-02T15:41:48.926604962Z [INFO ] successfully generated summary (2021-09-24 00:00:00 +0000 UTC, 2021-09-25 00:00:00 +0000 UTC, SIMULATAN)
2024-02-02T15:41:48.926817351Z [ERROR] failed to save summary (SIMULATAN, 2021-09-24 10:35:32.449 +0000 UTC, 2021-09-24 20:31:48.87 +0000 UTC) - unsupported data type: [0xc00aed3d40 0xc00b097b80 0xc00b097bc0 0xc00adacfc0 0xc00adacf00 0xc00adacf80 0xc00adacf40 0xc00b097c80 0xc00b097c40 0xc00b097c00]: Table not set, please set it like: db.Model(&user) or db.Table("users")
2024-02-02T15:41:48.928200007Z [INFO ] successfully generated summary (2021-09-25 00:00:00 +0000 UTC, 2021-09-26 00:00:00 +0000 UTC, SIMULATAN)
2024-02-02T15:41:48.928281861Z [ERROR] failed to save summary (SIMULATAN, 2021-09-25 10:14:04.56 +0000 UTC, 2021-09-25 10:14:04.56 +0000 UTC) - unsupported data type: [0xc00acfed80 0xc00b097fc0 0xc00adad000 0xc00b6d8040 0xc00b6d8000]: Table not set, please set it like: db.Model(&user) or db.Table("users")
2024-02-02T15:41:48.930429982Z [INFO ] successfully generated summary (2021-09-26 00:00:00 +0000 UTC, 2021-09-27 00:00:00 +0000 UTC, SIMULATAN)
2024-02-02T15:41:48.930507297Z [ERROR] failed to save summary (SIMULATAN, 2021-09-26 19:39:50.519 +0000 UTC, 2021-09-26 20:03:57.204 +0000 UTC) - unsupported data type: [0xc00b6d4480 0xc00adad3c0 0xc00b6d80c0 0xc00b6d8100 0xc00adad440 0xc00adad400]: Table not set, please set it like: db.Model(&user) or db.Table("users")

(this goes on for quite a while)

Some screenshots

wakapi activity chart
wakapi activity cards
wakatime total time

Heartbeats of failing days

2021-09-27
2021-10-31

Still, it's pretty usable as-is, so I'm happy with the current state.

@alanhamlett
Contributor

alanhamlett commented Feb 2, 2024

@alanhamlett Just being curious, were there any recent changes on WakaTime's end that might be related to?

No, the exports/dumps are downloaded directly from S3 so nothing to do with WakaTime servers.

If the EOF is coming from /api/v1/users/current/data_dumps then that endpoint hasn't changed in many years.
