{"payload":{"feedbackUrl":"https://github.com/orgs/community/discussions/53140","repo":{"id":309258998,"defaultBranch":"main","name":"browsertrix-crawler","ownerLogin":"webrecorder","currentUserCanPush":false,"isFork":false,"isEmpty":false,"createdAt":"2020-11-02T04:37:14.000Z","ownerAvatar":"https://avatars.githubusercontent.com/u/13686290?v=4","public":true,"private":false,"isOrgOwned":true},"refInfo":{"name":"","listCacheKey":"v0:1716587515.0","currentOid":""},"activityList":{"items":[{"before":"e8e1cdf8af1ab264c1deecaebe67e2af61cd2d96","after":null,"ref":"refs/heads/direct-fetch-optimize","pushedAt":"2024-05-24T21:51:55.000Z","pushType":"branch_deletion","commitsCount":0,"pusher":{"login":"ikreymer","name":"Ilya Kreymer","path":"/ikreymer","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/1015759?s=80&v=4"}},{"before":"089d901b9b17af59dcb2bc58bad2ef55deeecc16","after":"a7d279cfbd3f6ca8375e564d1e0ef751af7e345f","ref":"refs/heads/main","pushedAt":"2024-05-24T21:51:51.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"ikreymer","name":"Ilya Kreymer","path":"/ikreymer","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/1015759?s=80&v=4"},"commit":{"message":"Load non-HTML resources directly whenever possible (#583)\n\nOptimize the direct loading of non-HTML pages. Currently, the behavior\r\nis:\r\n- make a HEAD request first\r\n- make a direct fetch request only if HEAD request is a non-HTML and 200\r\n- only use fetch request if non-HTML and 200 and doesn't set any cookies\r\n\r\nThis changes the behavior to:\r\n- get cookies from browser for page URL\r\n- make a direct fetch request with cookies, if provided\r\n- only use fetch request if non-HTML and 200\r\nAlso:\r\n- ensures pageinfo is properly set with timestamp for direct fetch.\r\n- remove obsolete Agent handling that is no longer used in default\r\n(fetch)\r\n\r\nIf fetch request results in HTML, the response is aborted and browser\r\nloading is used.","shortMessageHtmlLink":"Load non-HTML resources directly whenever possible (#583)"}},{"before":"49573969ca1c3567d5dc371f4801dc5e34181dbc","after":"e8e1cdf8af1ab264c1deecaebe67e2af61cd2d96","ref":"refs/heads/direct-fetch-optimize","pushedAt":"2024-05-24T00:16:10.000Z","pushType":"push","commitsCount":2,"pusher":{"login":"ikreymer","name":"Ilya Kreymer","path":"/ikreymer","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/1015759?s=80&v=4"},"commit":{"message":"only handle non-redirects in direct fetch, as redirects records get serialized even if need to redo via browser","shortMessageHtmlLink":"only handle non-redirects in direct fetch, as redirects records get s…"}},{"before":null,"after":"38426f7068cef34e53379be42fbc5b3432698db6","ref":"refs/heads/refactor-headers","pushedAt":"2024-05-23T23:15:59.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"ikreymer","name":"Ilya Kreymer","path":"/ikreymer","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/1015759?s=80&v=4"},"commit":{"message":"Merge branch 'main' into refactor-headers","shortMessageHtmlLink":"Merge branch 'main' into refactor-headers"}},{"before":"23b55eb9946ef45f0ce16cb5ba9383a99da10d86","after":"49573969ca1c3567d5dc371f4801dc5e34181dbc","ref":"refs/heads/direct-fetch-optimize","pushedAt":"2024-05-23T22:40:30.000Z","pushType":"push","commitsCount":2,"pusher":{"login":"ikreymer","name":"Ilya Kreymer","path":"/ikreymer","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/1015759?s=80&v=4"},"commit":{"message":"make manualRedirect opt configurable","shortMessageHtmlLink":"make manualRedirect opt configurable"}},{"before":null,"after":"23b55eb9946ef45f0ce16cb5ba9383a99da10d86","ref":"refs/heads/direct-fetch-optimize","pushedAt":"2024-05-23T17:55:55.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"ikreymer","name":"Ilya Kreymer","path":"/ikreymer","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/1015759?s=80&v=4"},"commit":{"message":"direct fetch optimization:\n- drop initial HEAD check (and obsolete 'agent' params to fetch, no longer used)\n- load cookies for each page for direct fetch\n- attempt direct fetch GET request on every page with cookies + correct user-agent\n- abort direct fetch if response is HTML, and then load in browser, otherwise proceed with direct fetch\n- ensure direct fetch timestamp is set correctly, populated in pageinfo","shortMessageHtmlLink":"direct fetch optimization:"}},{"before":"ebe8b387473a4924e16ff86155288350be0aa596","after":null,"ref":"refs/heads/add-warc-info","pushedAt":"2024-05-22T22:47:09.000Z","pushType":"branch_deletion","commitsCount":0,"pusher":{"login":"ikreymer","name":"Ilya Kreymer","path":"/ikreymer","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/1015759?s=80&v=4"}},{"before":"894681e5fcf31f4e8553f7f5d9cb57cde4b30bb9","after":"089d901b9b17af59dcb2bc58bad2ef55deeecc16","ref":"refs/heads/main","pushedAt":"2024-05-22T22:47:05.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"ikreymer","name":"Ilya Kreymer","path":"/ikreymer","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/1015759?s=80&v=4"},"commit":{"message":"Always add warcinfo records to all WARCs (#556)\n\nFixes #553 \r\n\r\nIncludes `warcinfo` records at the beginning of new WARCs, as well as\r\nthe combined WARC.\r\nMakes the warcinfo record also WARC/1.1 to match the rest of the WARC\r\nrecords.","shortMessageHtmlLink":"Always add warcinfo records to all WARCs (#556)"}},{"before":"b17938ba79e070ee0bbf94f47b7522aab81cf4c0","after":null,"ref":"refs/heads/add-make-draft-release","pushedAt":"2024-05-22T22:45:52.000Z","pushType":"branch_deletion","commitsCount":0,"pusher":{"login":"ikreymer","name":"Ilya Kreymer","path":"/ikreymer","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/1015759?s=80&v=4"}},{"before":"6c15bb3f007751cea29b90874f13bde38732d03e","after":"894681e5fcf31f4e8553f7f5d9cb57cde4b30bb9","ref":"refs/heads/main","pushedAt":"2024-05-22T22:45:48.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"ikreymer","name":"Ilya Kreymer","path":"/ikreymer","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/1015759?s=80&v=4"},"commit":{"message":"Bump version to 1.2.0 Beta + make draft release for each commit (#582)\n\nGenerate draft release from main and *-release branches to simplify\r\nrelease process\r\n\r\n---------\r\nCo-authored-by: Tessa Walsh ","shortMessageHtmlLink":"Bump version to 1.2.0 Beta + make draft release for each commit (#582)"}},{"before":"6d27fa08b88aa700c42c5b100ee5174ce29a1112","after":"b17938ba79e070ee0bbf94f47b7522aab81cf4c0","ref":"refs/heads/add-make-draft-release","pushedAt":"2024-05-22T22:04:25.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"tw4l","name":"Tessa Walsh","path":"/tw4l","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6758804?s=80&v=4"},"commit":{"message":"Add pywb logo","shortMessageHtmlLink":"Add pywb logo"}},{"before":"de4039045350c1b9a4eaf50c4f7aac8ef217ddb3","after":"6d27fa08b88aa700c42c5b100ee5174ce29a1112","ref":"refs/heads/add-make-draft-release","pushedAt":"2024-05-22T21:51:01.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"tw4l","name":"Tessa Walsh","path":"/tw4l","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6758804?s=80&v=4"},"commit":{"message":"Remove errant Object from pageinfo resources","shortMessageHtmlLink":"Remove errant Object from pageinfo resources"}},{"before":"6a68d4f13417291ac3bcb854f6b7e17843b65775","after":"de4039045350c1b9a4eaf50c4f7aac8ef217ddb3","ref":"refs/heads/add-make-draft-release","pushedAt":"2024-05-22T21:48:08.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"tw4l","name":"Tessa Walsh","path":"/tw4l","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6758804?s=80&v=4"},"commit":{"message":"Update pageinfo to match updated webrecorder.net","shortMessageHtmlLink":"Update pageinfo to match updated webrecorder.net"}},{"before":"994987eb5389af94c8804cd3430ed2147a395cd8","after":"6a68d4f13417291ac3bcb854f6b7e17843b65775","ref":"refs/heads/add-make-draft-release","pushedAt":"2024-05-22T17:50:09.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"ikreymer","name":"Ilya Kreymer","path":"/ikreymer","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/1015759?s=80&v=4"},"commit":{"message":"readd branches filter","shortMessageHtmlLink":"readd branches filter"}},{"before":"e539cb8f070ee0ba9be2c2cc148f7ba71564f82c","after":"994987eb5389af94c8804cd3430ed2147a395cd8","ref":"refs/heads/add-make-draft-release","pushedAt":"2024-05-22T17:48:35.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"ikreymer","name":"Ilya Kreymer","path":"/ikreymer","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/1015759?s=80&v=4"},"commit":{"message":"more tweaks","shortMessageHtmlLink":"more tweaks"}},{"before":"bedcfd120b8086cf232af8132ce5c49d73c6b00c","after":"e539cb8f070ee0ba9be2c2cc148f7ba71564f82c","ref":"refs/heads/add-make-draft-release","pushedAt":"2024-05-22T17:45:40.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"ikreymer","name":"Ilya Kreymer","path":"/ikreymer","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/1015759?s=80&v=4"},"commit":{"message":"fix","shortMessageHtmlLink":"fix"}},{"before":"8ddba8c9687b7d183f15e555fee1eb5d633edeb2","after":"bedcfd120b8086cf232af8132ce5c49d73c6b00c","ref":"refs/heads/add-make-draft-release","pushedAt":"2024-05-22T17:43:58.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"ikreymer","name":"Ilya Kreymer","path":"/ikreymer","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/1015759?s=80&v=4"},"commit":{"message":"fix","shortMessageHtmlLink":"fix"}},{"before":null,"after":"8ddba8c9687b7d183f15e555fee1eb5d633edeb2","ref":"refs/heads/add-make-draft-release","pushedAt":"2024-05-22T17:34:59.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"ikreymer","name":"Ilya Kreymer","path":"/ikreymer","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/1015759?s=80&v=4"},"commit":{"message":"ci: add generating draft release for current version","shortMessageHtmlLink":"ci: add generating draft release for current version"}},{"before":"34fcbc9e682eab5ad4b8c10c77febe8f357d7155","after":null,"ref":"refs/heads/no-sandbox-only-if-root","pushedAt":"2024-05-22T17:25:44.000Z","pushType":"branch_deletion","commitsCount":0,"pusher":{"login":"ikreymer","name":"Ilya Kreymer","path":"/ikreymer","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/1015759?s=80&v=4"}},{"before":"2292c9cfc21baba72e399a9c20a7181296fa157a","after":null,"ref":"refs/heads/bump-base-image-1.64.122","pushedAt":"2024-05-22T17:25:38.000Z","pushType":"branch_deletion","commitsCount":0,"pusher":{"login":"ikreymer","name":"Ilya Kreymer","path":"/ikreymer","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/1015759?s=80&v=4"}},{"before":"1fcd3b7d6b2a440ad871219b97ba841365d96917","after":"6c15bb3f007751cea29b90874f13bde38732d03e","ref":"refs/heads/main","pushedAt":"2024-05-21T23:37:17.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"ikreymer","name":"Ilya Kreymer","path":"/ikreymer","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/1015759?s=80&v=4"},"commit":{"message":"version: bump to 1.1.3","shortMessageHtmlLink":"version: bump to 1.1.3"}},{"before":"ad3b8fbc962633e4019310cbb3933a199a60e6b5","after":null,"ref":"refs/heads/issue-575-fail-failed-limit","pushedAt":"2024-05-21T23:35:46.000Z","pushType":"branch_deletion","commitsCount":0,"pusher":{"login":"ikreymer","name":"Ilya Kreymer","path":"/ikreymer","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/1015759?s=80&v=4"}},{"before":"27226255eef038f206c1f8644ee810773246c733","after":"1fcd3b7d6b2a440ad871219b97ba841365d96917","ref":"refs/heads/main","pushedAt":"2024-05-21T23:35:43.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"ikreymer","name":"Ilya Kreymer","path":"/ikreymer","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/1015759?s=80&v=4"},"commit":{"message":"Fix failOnFailedLimit and add tests (#580)\n\nFixes #575\r\n\r\n- Adds a missing await to fetching the number of failed pages from Redis\r\n- Fixes a typo in the fatal logging message\r\n- Adds a test to ensure that the crawl fails with exit code 17 if\r\n--failOnInvalidStatus and --failOnFailedLimit 1 are set with a url that\r\nwill 404","shortMessageHtmlLink":"Fix failOnFailedLimit and add tests (#580)"}},{"before":"31c8d78ff4921c8da1cbdc9517acd2bb7d183996","after":null,"ref":"refs/heads/sitemap-parse-fix","pushedAt":"2024-05-21T21:24:20.000Z","pushType":"branch_deletion","commitsCount":0,"pusher":{"login":"ikreymer","name":"Ilya Kreymer","path":"/ikreymer","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/1015759?s=80&v=4"}},{"before":"6b04a39f2f781a4448201941e8c792f1308364fe","after":"27226255eef038f206c1f8644ee810773246c733","ref":"refs/heads/main","pushedAt":"2024-05-21T21:24:17.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"ikreymer","name":"Ilya Kreymer","path":"/ikreymer","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/1015759?s=80&v=4"},"commit":{"message":"Sitemap Parsing Fixes (#578)\n\nAdditional fixes for sitemaps:\r\n- Fix parsing sitemaps that have data wrapped in CDATA fields, fixes\r\npart of https://github.com/webrecorder/browsertrix/issues/1750\r\n- Fix parsing where the .gz sitemap have content-encoding and are\r\nactually not gzipped\r\n- Ensure error in gzip parsing doesn't break crawl, just errors sitemap\r\nparsing.","shortMessageHtmlLink":"Sitemap Parsing Fixes (#578)"}},{"before":"8101958f37bc9e996ad9c221c9eab6d3e8c9aa10","after":"ad3b8fbc962633e4019310cbb3933a199a60e6b5","ref":"refs/heads/issue-575-fail-failed-limit","pushedAt":"2024-05-21T19:24:24.000Z","pushType":"force_push","commitsCount":0,"pusher":{"login":"tw4l","name":"Tessa Walsh","path":"/tw4l","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6758804?s=80&v=4"},"commit":{"message":"Fix failOnFailedLimit and add tests\n\nAdds a missing await to fetching the number of failed pages from\nRedis, fixes a typo in the fatal log, and adds a test to ensure\nthat the crawl fails with exit code 17 if --failOnInvalidStatus\nand --failOnFailedLimit 1 are set with a url that will 404.","shortMessageHtmlLink":"Fix failOnFailedLimit and add tests"}},{"before":"c153b133dca499b21917bee12112e2f45fbc21b7","after":"31c8d78ff4921c8da1cbdc9517acd2bb7d183996","ref":"refs/heads/sitemap-parse-fix","pushedAt":"2024-05-21T18:05:18.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"ikreymer","name":"Ilya Kreymer","path":"/ikreymer","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/1015759?s=80&v=4"},"commit":{"message":"also check for .xml.gz extension","shortMessageHtmlLink":"also check for .xml.gz extension"}},{"before":"4e6a4686e1d0224ac9f12688e41100713668e82e","after":"8101958f37bc9e996ad9c221c9eab6d3e8c9aa10","ref":"refs/heads/issue-575-fail-failed-limit","pushedAt":"2024-05-21T18:03:23.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"tw4l","name":"Tessa Walsh","path":"/tw4l","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6758804?s=80&v=4"},"commit":{"message":"Come on now","shortMessageHtmlLink":"Come on now"}},{"before":"642a727a59227f49daf3d2dd64a0fdb765c30a50","after":"0d66ec2afcbf9fabbd71d4ca67980caa3566d375","ref":"refs/heads/gh-pages","pushedAt":"2024-05-21T17:59:51.000Z","pushType":"force_push","commitsCount":0,"pusher":{"login":"github-actions[bot]","name":null,"path":"/apps/github-actions","primaryAvatarUrl":"https://avatars.githubusercontent.com/in/15368?s=80&v=4"},"commit":{"message":"Deployed 2ef116d with MkDocs version: 1.6.0","shortMessageHtmlLink":"Deployed 2ef116d with MkDocs version: 1.6.0"}},{"before":"f8746a5f840bace1ab0bfcc55d7b471df5878341","after":null,"ref":"refs/heads/fix-save-state-pending","pushedAt":"2024-05-21T17:58:39.000Z","pushType":"branch_deletion","commitsCount":0,"pusher":{"login":"ikreymer","name":"Ilya Kreymer","path":"/ikreymer","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/1015759?s=80&v=4"}}],"hasNextPage":true,"hasPreviousPage":false,"activityType":"all","actor":null,"timePeriod":"all","sort":"DESC","perPage":30,"cursor":"djE6ks8AAAAEU212JAA","startCursor":null,"endCursor":null}},"title":"Activity · webrecorder/browsertrix-crawler"}