Can't evaluate Select #181

Closed
TheTechRobo opened this issue May 5, 2021 · 4 comments

@TheTechRobo
Contributor

TheTechRobo commented May 5, 2021

I am trying to back up a site.

Here is what I run and the output.

$ grab-site --concurrency 1 --delay 1000 https://www.khanacademy.org/test-prep/mcat
psutil: No module named 'psutil'. Resource monitoring will be unavailable.
Manhole[16762:1620245077.6276]: Patched <built-in function fork> and <built-in function forkpty>.
Manhole[16762:1620245077.6541]: Manhole UDS path: /tmp/manhole-16762
Manhole[16762:1620245077.6544]: Waiting for new connection (in pid:16762) ...
Created lmdb db with map_size=1099511627776
Imported /home/thetechrobo/grab-site/www.khanacademy.org-test-prep-mcat-2021-05-05-37e0452b/igsets
Using these 191 ignores:
        %25252525
        /%22%20\+[^/]+\+%20%22
        /%22\+[^/]+\+%22
        /%27%20\+[^/]+\+%20%27
        /%27\+[^/]+\+%27
        /%5C/%5C/
        /'\+[^/]+\+'
        /(%5C)+(%22|%27)
        /App_Themes/.+/App_Themes/
        /\\+(%22|%27)
        /\\+["']
        /\\/\\/
        /bxSlider/.+/bxSlider/
        /bxSlider/bxSlider/
        /clientscript/.+/clientscript/clientscript/
        /clientscript/clientscript/.+/clientscript/
        /clientscript/clientscript/clientscript/
        /css/.+/css/css/
        /css/css/.+/css/
        /css/css/css/
        /images/.+/images/images/
        /images/images/.+/images/
        /images/images/images/
        /img/.+/img/img/
        /img/img/.+/img/
        /img/img/img/
        /js/.+/js/js/
        /js/js/.+/js/
        /js/js/js/
        /lib/exe/.*lib[-_]exe[-_]lib[-_]exe[-_]
        /scripts/.+/scripts/scripts/
        /scripts/scripts/.+/scripts/
        /scripts/scripts/scripts/
        /slides/.+/slides/slides/
        /slides/slides/.+/slides/
        /slides/slides/slides/
        /styles/.+/styles/styles/
        /styles/styles/.+/styles/
        /styles/styles/styles/
        ^https?://((s-)?static\.ak\.fbcdn\.net|(connect\.|www\.)?facebook\.com)/connect\.php/js/.*rsrc\.php
        ^https?://([^/]+\.)?gdcvault\.com(/.*/|/)(fonts(/.*/|/)fonts/|css(/.*/|/)css/|img(/.*/|/)img/)
        ^https?://([^\./]+\.)?stream\.publicradio\.org/
        ^https?://([^\.]+\.)?pinterest\.com/pin/create/
        ^https?://(\d|www|secure)\.gravatar\.com/avatar/ad516503a11cd5ca435acc9bb6523536
        ^https?://(apis|plusone)\.google\.com/_/\+1/
        ^https?://(audio\d?|nfw)\.video\.ria\.ru/
        ^https?://(ssl\.|www\.)?reddit\.com/(login\?dest=|submit\?|static/button/button)
        ^https?://(www\.)?(megaupload|filesonic|wupload)\.com/
        ^https?://(www\.)?digg\.com/submit\?
        ^https?://(www\.)?facebook\.com/(plugins/(share_button|like(box)?)\.php|sharer/sharer\.php|sharer?\.php|dialog/(feed|share))\?
        ^https?://(www\.)?facebook\.com/v[\d\.]+/plugins/like\.php
        ^https?://(www\.)?friendfeed\.com/share\?
        ^https?://(www\.)?instapaper\.com/hello2\?
        ^https?://(www\.)?myspace\.com/Modules/PostTo/
        ^https?://(www\.)?stumbleupon\.com/(submit\?|badge/embed/)
        ^https?://(www\.)?technorati\.com/faves/?\?add=
        ^https?://(www\.)?twitter\.com/(share\?|intent/((re)?tweet|favorite)|home/?\?status=|\?status=)
        ^https?://(www\.)?xing\.com/(app/user\?op=share|social_plugins/share\?)
        ^https?://(www|draft)\.blogger\.com/(navbar\.g|post-edit\.g|delete-comment\.g|comment-iframe\.g|share-post\.g|email-post\.g|blog-this\.g|delete-backlink\.g|rearrange|blog_this\.pyra)\?
        ^https?://(www|px\.srvcs)\.tumblr\.com/(impixu\?|share(/link/?)?\?|reblog/)
        ^https?://(www|ssl)\.google-analytics\.com/(r/)?(__utm\.gif|collect\?)
        ^https?://.+/.+/disqus\.com/forums/$
        ^https?://.+/js-agent\.newrelic\.com/nr-\d{3}(\.min)?\.js$
        ^https?://.+/js/chartbeat\.js$
        ^https?://.+/stats\.g\.doubleclick\.net/dc\.js$
        ^https?://.+\.blogspot\.(com|in|com\.au|co\.uk|jp|co\.nz|ca|de|it|fr|se|sg|es|pt|com\.br|ar|mx|kr)/(\d{4}/\d{2}/|search/label/)(CSI/$|.*/CSI/CSI/CSI/)
        ^https?://[^/]*musicproxy\.s12\.de/
        ^https?://[^/]+/.+/CaptchaImage\.axd
        ^https?://[^/]+/anony/mjpg\.cgi$
        ^https?://[^/]+/mjpg/video\.mjpg
        ^https?://[^/]+\.akadostream\.ru(:\d+)?/
        ^https?://[^/]+\.corp\.ne1\.yahoo\.com/
        ^https?://[^/]+\.facebook\.com/login\.php
        ^https?://[^/]+\.gaduradio\.pl/
        ^https?://[^/]+\.libsyn\.com/.+/%2[02]https?:/
        ^https?://[^/]+\.rastream\.com(:\d+)?/
        ^https?://[^/]+\.services\.livejournal\.com/ljcounter
        ^https?://[^/]+\.streamtheworld\.com/
        ^https?://[^/]+\.xiti\.com/hit\.xiti\?
        ^https?://[^\./]+\.radioscoop\.(com|net):\d+/
        ^https?://[^\./]+\.streamchan\.org:\d+/
        ^https?://[^\.]+\.livejournal\.com/.+/\*sup_ru/ru/UTF-8/
        ^https?://[^\.]+\.livejournal\.com/.+http://[^\.]+\.livejournal\.com/
        ^https?://[a-z0-9]+\.cdn\.dvmr\.fr(:\d+)?/.+\.mp3
        ^https?://\d+\.media\.tumblr\.com/avatar_.+_16\.pn[gj]$
        ^https?://accounts\.google\.com/(SignUp|ServiceLogin|AccountChooser|a/UniversalLogin)
        ^https?://add\.my\.yahoo\.com/(rss|content)\?
        ^https?://air\.radiorecord\.ru(:\d+)?/
        ^https?://alb\.reddit\.com/
        ^https?://api\.addthis\.com/
        ^https?://audio\d?\.radioreference\.com/
        ^https?://audiots\.scdn\.arkena\.com/
        ^https?://av\.rasset\.ie/av/live/
        ^https?://b\.hatena\.ne\.jp/add\?
        ^https?://b\.scorecardresearch\.com/
        ^https?://beacon\.wikia-services\.com/
        ^https?://bookmark\.naver\.com/post\?
        ^https?://bufferapp\.com/add\?
        ^https?://connect\.mail\.ru/share\?
        ^https?://csp\.cyworld\.com/bi/bi_recommend_pop\.php\?
        ^https?://del\.icio\.us/post\?
        ^https?://delicious\.com/(save|post)\?
        ^https?://download\.ted\.com/
        ^https?://flattr.com/submit/auto\?
        ^https?://gcnplayer\.gcnlive\.com/.+
        ^https?://geo\.yahoo\.com/b\?
        ^https?://getpocket\.com/(save|edit)/?\?
        ^https?://i\.dev\.cdn\.turner\.com/
        ^https?://imageshack\.com/lost$
        ^https?://iwiw\.hu/pages/share/share\.jsp\?
        ^https?://mail\.google\.com/mail/
        ^https?://media\.opb\.org/clips/embed/.+\.js$
        ^https?://medium\.com/_/(vote|bookmark|subscribe)/
        ^https?://memori(\.qip)?\.ru/link/\?
        ^https?://mp3\.ffh\.de/
        ^https?://mp3tslg\.tdf-cdn\.com/
        ^https?://myweb2\.search\.yahoo\.com/myresults/bookmarklet\?
        ^https?://news\.ycombinator\.com/submitlink\?
        ^https?://p\.opt\.fimserve\.com/
        ^https?://photobucket\.com/.+/albums/.+/albums/
        ^https?://pixel\.(quantserve|wp)\.com/
        ^https?://pixel\.blog\.hu/
        ^https?://pixel\.redditmedia\.com/pixel/
        ^https?://platform\d?\.twitter\.com/widgets/tweet_button.html\?
        ^https?://play(\d+)?\.radio13\.ru:8000/
        ^https?://plus\.google\.com/share\?
        ^https?://posterous\.com/share\?
        ^https?://prod-preview\.wired\.com/
        ^https?://pub(\d+)?\.di\.fm/
        ^https?://r-a-d\.io/.+\.mp3$
        ^https?://r-login\.wordpress\.com/remote-login\.php
        ^https?://relay\.broadcastify\.com/
        ^https?://reporter\.es\.msn\.com/\?fn=contribute
        ^https?://s\d+\.sitemeter\.com/(js/counter\.js|meter\.asp)
        ^https?://service\.weibo\.com/share/share\.php\?
        ^https?://share\.flipboard\.com/bookmarklet/popout\?
        ^https?://social-plugins\.line\.me/lineit/share
        ^https?://sphinn\.com/index\.php\?c=post&m=submit&
        ^https?://static\.licdn\.com/sc/p/.+/f//
        ^https?://static\.licdn\.com/sc/p/com\.linkedin\.nux(:|%3A)nux-static-content(\+|%2B)[\d\.]+/f/
        ^https?://stream(\d+)?\.media\.rambler\.ru/
        ^https?://telegram\.me/share/url\?
        ^https?://tm\.uol\.com\.br/h/.+/h/
        ^https?://tmz\.vo\.llnwd\.net/
        ^https?://upload\.wikimedia\.org/wikipedia/[^/]+/thumb/
        ^https?://video-subtitle\.tedcdn\.com/
        ^https?://vkontakte\.ru/share\.php\?
        ^https?://vuible\.com/pins-settings/
        ^https?://web\.archive\.org/web/[^/]+/https?\:/[^/]+\.addthis\.com/.+/static/.+/static/
        ^https?://wow\.ya\.ru/posts_(add|share)_link\.xml\?
        ^https?://www\.addthis\.com/bookmark\.php\?
        ^https?://www\.addtoany\.com/(add_to/|share_save\?)
        ^https?://www\.amazon\.com/.+/logging/log-action\.html
        ^https?://www\.blinklist\.com/index\.php\?Action=Blink/addblink\.php
        ^https?://www\.blogger\.com/feeds/\d+/\d+/comments/default/\d+
        ^https?://www\.blogger\.com/feeds/\d+/posts/default/\d+
        ^https?://www\.dreamwidth\.org/tools/(memadd|tellafriend)\?
        ^https?://www\.flickr\.com/(explore/|photos/[^/]+/(sets/\d+/(page\d+/)?)?)\d+_[a-f0-9]+(_[a-z])?\.jpg$
        ^https?://www\.flickr\.com/change_language\.gne
        ^https?://www\.google\.com/(reader/link\?|buzz/post\?)
        ^https?://www\.google\.com/accounts/AccountChooser
        ^https?://www\.google\.com/bookmarks/mark\?
        ^https?://www\.google\.com/recaptcha/(api|mailhide/d\?)
        ^https?://www\.infomous\.com/cloud_widget/lib/lib/
        ^https?://www\.khaleejtimes\.com/.+/images/.+/images/
        ^https?://www\.khaleejtimes\.com/.+/imgactv/.+/imgactv/
        ^https?://www\.khaleejtimes\.com/.+/kt_.+/kt_
        ^https?://www\.khanacademy\.org(/.*|/)page/%d/$
        ^https?://www\.khanacademy\.org/(wp-admin/|wp-login\.php\?)
        ^https?://www\.khanacademy\.org/.*%5Cx26route=/archive
        ^https?://www\.khanacademy\.org/.*&amp;amp;amp;
        ^https?://www\.khanacademy\.org/.*(\?|%5Cx26)route=(/page/:page|/archive/:year/:month|/tagged/:tag|/post/:id|/image/:post_id)
        ^https?://www\.khanacademy\.org/.*amp%3Bamp%3Bamp%3B
        ^https?://www\.khanacademy\.org/.+/%3Ca%20href=
        ^https?://www\.khanacademy\.org/.+/jetpack-comment/\?blogid=\d+&postid=\d+
        ^https?://www\.khanacademy\.org/.+/plugins/ultimate-social-media-plus/.+/like/like/
        ^https?://www\.khanacademy\.org/.+/quote-comment-\d+/$
        ^https?://www\.khanacademy\.org/.+[\?&](replyto(com)?|like_comment)=\d+
        ^https?://www\.khanacademy\.org/.+[\?&]mode=reply
        ^https?://www\.khanacademy\.org/.+[\?&]share=[a-z]{4,}
        ^https?://www\.khanacademy\.org/.+\?showComment(=|%5C)\d+
        ^https?://www\.khanacademy\.org/search(/label/[^\?]+|\?q=[^&]+|)[\?&]updated-(min|max)=\d{4}-\d\d-\d\dT\d\d:\d\d:\d\d.*&max-results=\d+
        ^https?://www\.linkedin\.com/(cws/share|shareArticle)\?
        ^https?://www\.livejournal\.com/(tools/memadd|update|(identity/)?login)\.bml\?
        ^https?://www\.netvibes\.com/subscribe\.php\?
        ^https?://www\.newsvine\.com/_wine/save\?
        ^https?://www\.odnoklassniki\.ru/dk\?st\.cmd=addShare
        ^https?://www\.warnerbros\.com/\d+$
        ^https?://www\.youtube\.com/.*\[\[.+\]\]
        ^https?://www\.youtube\.com/.*\{\{.+\}\}
        ^https?://zakladki\.yandex\.ru/newlink\.xml\?
/home/thetechrobo/gs-venv/lib/python3.7/site-packages/sqlalchemy/sql/coercions.py:308: SAWarning: implicitly coercing SELECT object to scalar subquery; please use the .scalar_subquery() method to produce a scalar subquery.
  "implicitly coercing SELECT object to scalar subquery; "
Disconnected from ws:// server: ConnectionRefusedError(111, "Connect call failed ('127.0.0.1', 29000)")
Imported /home/thetechrobo/grab-site/www.khanacademy.org-test-prep-mcat-2021-05-05-37e0452b/ignores
Using these 191 ignores:
        %25252525
        /%22%20\+[^/]+\+%20%22
        /%22\+[^/]+\+%22
        /%27%20\+[^/]+\+%20%27
        /%27\+[^/]+\+%27
        /%5C/%5C/
        /'\+[^/]+\+'
        /(%5C)+(%22|%27)
        /App_Themes/.+/App_Themes/
        /\\+(%22|%27)
        /\\+["']
        /\\/\\/
        /bxSlider/.+/bxSlider/
        /bxSlider/bxSlider/
        /clientscript/.+/clientscript/clientscript/
        /clientscript/clientscript/.+/clientscript/
        /clientscript/clientscript/clientscript/
        /css/.+/css/css/
        /css/css/.+/css/
        /css/css/css/
        /images/.+/images/images/
        /images/images/.+/images/
        /images/images/images/
        /img/.+/img/img/
        /img/img/.+/img/
        /img/img/img/
        /js/.+/js/js/
        /js/js/.+/js/
        /js/js/js/
        /lib/exe/.*lib[-_]exe[-_]lib[-_]exe[-_]
        /scripts/.+/scripts/scripts/
        /scripts/scripts/.+/scripts/
        /scripts/scripts/scripts/
        /slides/.+/slides/slides/
        /slides/slides/.+/slides/
        /slides/slides/slides/
        /styles/.+/styles/styles/
        /styles/styles/.+/styles/
        /styles/styles/styles/
        ^https?://((s-)?static\.ak\.fbcdn\.net|(connect\.|www\.)?facebook\.com)/connect\.php/js/.*rsrc\.php
        ^https?://([^/]+\.)?gdcvault\.com(/.*/|/)(fonts(/.*/|/)fonts/|css(/.*/|/)css/|img(/.*/|/)img/)
        ^https?://([^\./]+\.)?stream\.publicradio\.org/
        ^https?://([^\.]+\.)?pinterest\.com/pin/create/
        ^https?://(\d|www|secure)\.gravatar\.com/avatar/ad516503a11cd5ca435acc9bb6523536
        ^https?://(apis|plusone)\.google\.com/_/\+1/
        ^https?://(audio\d?|nfw)\.video\.ria\.ru/
        ^https?://(ssl\.|www\.)?reddit\.com/(login\?dest=|submit\?|static/button/button)
        ^https?://(www\.)?(megaupload|filesonic|wupload)\.com/
        ^https?://(www\.)?digg\.com/submit\?
        ^https?://(www\.)?facebook\.com/(plugins/(share_button|like(box)?)\.php|sharer/sharer\.php|sharer?\.php|dialog/(feed|share))\?
        ^https?://(www\.)?facebook\.com/v[\d\.]+/plugins/like\.php
        ^https?://(www\.)?friendfeed\.com/share\?
        ^https?://(www\.)?instapaper\.com/hello2\?
        ^https?://(www\.)?myspace\.com/Modules/PostTo/
        ^https?://(www\.)?stumbleupon\.com/(submit\?|badge/embed/)
        ^https?://(www\.)?technorati\.com/faves/?\?add=
        ^https?://(www\.)?twitter\.com/(share\?|intent/((re)?tweet|favorite)|home/?\?status=|\?status=)
        ^https?://(www\.)?xing\.com/(app/user\?op=share|social_plugins/share\?)
        ^https?://(www|draft)\.blogger\.com/(navbar\.g|post-edit\.g|delete-comment\.g|comment-iframe\.g|share-post\.g|email-post\.g|blog-this\.g|delete-backlink\.g|rearrange|blog_this\.pyra)\?
        ^https?://(www|px\.srvcs)\.tumblr\.com/(impixu\?|share(/link/?)?\?|reblog/)
        ^https?://(www|ssl)\.google-analytics\.com/(r/)?(__utm\.gif|collect\?)
        ^https?://.+/.+/disqus\.com/forums/$
        ^https?://.+/js-agent\.newrelic\.com/nr-\d{3}(\.min)?\.js$
        ^https?://.+/js/chartbeat\.js$
        ^https?://.+/stats\.g\.doubleclick\.net/dc\.js$
        ^https?://.+\.blogspot\.(com|in|com\.au|co\.uk|jp|co\.nz|ca|de|it|fr|se|sg|es|pt|com\.br|ar|mx|kr)/(\d{4}/\d{2}/|search/label/)(CSI/$|.*/CSI/CSI/CSI/)
        ^https?://[^/]*musicproxy\.s12\.de/
        ^https?://[^/]+/.+/CaptchaImage\.axd
        ^https?://[^/]+/anony/mjpg\.cgi$
        ^https?://[^/]+/mjpg/video\.mjpg
        ^https?://[^/]+\.akadostream\.ru(:\d+)?/
        ^https?://[^/]+\.corp\.ne1\.yahoo\.com/
        ^https?://[^/]+\.facebook\.com/login\.php
        ^https?://[^/]+\.gaduradio\.pl/
        ^https?://[^/]+\.libsyn\.com/.+/%2[02]https?:/
        ^https?://[^/]+\.rastream\.com(:\d+)?/
        ^https?://[^/]+\.services\.livejournal\.com/ljcounter
        ^https?://[^/]+\.streamtheworld\.com/
        ^https?://[^/]+\.xiti\.com/hit\.xiti\?
        ^https?://[^\./]+\.radioscoop\.(com|net):\d+/
        ^https?://[^\./]+\.streamchan\.org:\d+/
        ^https?://[^\.]+\.livejournal\.com/.+/\*sup_ru/ru/UTF-8/
        ^https?://[^\.]+\.livejournal\.com/.+http://[^\.]+\.livejournal\.com/
        ^https?://[a-z0-9]+\.cdn\.dvmr\.fr(:\d+)?/.+\.mp3
        ^https?://\d+\.media\.tumblr\.com/avatar_.+_16\.pn[gj]$
        ^https?://accounts\.google\.com/(SignUp|ServiceLogin|AccountChooser|a/UniversalLogin)
        ^https?://add\.my\.yahoo\.com/(rss|content)\?
        ^https?://air\.radiorecord\.ru(:\d+)?/
        ^https?://alb\.reddit\.com/
        ^https?://api\.addthis\.com/
        ^https?://audio\d?\.radioreference\.com/
        ^https?://audiots\.scdn\.arkena\.com/
        ^https?://av\.rasset\.ie/av/live/
        ^https?://b\.hatena\.ne\.jp/add\?
        ^https?://b\.scorecardresearch\.com/
        ^https?://beacon\.wikia-services\.com/
        ^https?://bookmark\.naver\.com/post\?
        ^https?://bufferapp\.com/add\?
        ^https?://connect\.mail\.ru/share\?
        ^https?://csp\.cyworld\.com/bi/bi_recommend_pop\.php\?
        ^https?://del\.icio\.us/post\?
        ^https?://delicious\.com/(save|post)\?
        ^https?://download\.ted\.com/
        ^https?://flattr.com/submit/auto\?
        ^https?://gcnplayer\.gcnlive\.com/.+
        ^https?://geo\.yahoo\.com/b\?
        ^https?://getpocket\.com/(save|edit)/?\?
        ^https?://i\.dev\.cdn\.turner\.com/
        ^https?://imageshack\.com/lost$
        ^https?://iwiw\.hu/pages/share/share\.jsp\?
        ^https?://mail\.google\.com/mail/
        ^https?://media\.opb\.org/clips/embed/.+\.js$
        ^https?://medium\.com/_/(vote|bookmark|subscribe)/
        ^https?://memori(\.qip)?\.ru/link/\?
        ^https?://mp3\.ffh\.de/
        ^https?://mp3tslg\.tdf-cdn\.com/
        ^https?://myweb2\.search\.yahoo\.com/myresults/bookmarklet\?
        ^https?://news\.ycombinator\.com/submitlink\?
        ^https?://p\.opt\.fimserve\.com/
        ^https?://photobucket\.com/.+/albums/.+/albums/
        ^https?://pixel\.(quantserve|wp)\.com/
        ^https?://pixel\.blog\.hu/
        ^https?://pixel\.redditmedia\.com/pixel/
        ^https?://platform\d?\.twitter\.com/widgets/tweet_button.html\?
        ^https?://play(\d+)?\.radio13\.ru:8000/
        ^https?://plus\.google\.com/share\?
        ^https?://posterous\.com/share\?
        ^https?://prod-preview\.wired\.com/
        ^https?://pub(\d+)?\.di\.fm/
        ^https?://r-a-d\.io/.+\.mp3$
        ^https?://r-login\.wordpress\.com/remote-login\.php
        ^https?://relay\.broadcastify\.com/
        ^https?://reporter\.es\.msn\.com/\?fn=contribute
        ^https?://s\d+\.sitemeter\.com/(js/counter\.js|meter\.asp)
        ^https?://service\.weibo\.com/share/share\.php\?
        ^https?://share\.flipboard\.com/bookmarklet/popout\?
        ^https?://social-plugins\.line\.me/lineit/share
        ^https?://sphinn\.com/index\.php\?c=post&m=submit&
        ^https?://static\.licdn\.com/sc/p/.+/f//
        ^https?://static\.licdn\.com/sc/p/com\.linkedin\.nux(:|%3A)nux-static-content(\+|%2B)[\d\.]+/f/
        ^https?://stream(\d+)?\.media\.rambler\.ru/
        ^https?://telegram\.me/share/url\?
        ^https?://tm\.uol\.com\.br/h/.+/h/
        ^https?://tmz\.vo\.llnwd\.net/
        ^https?://upload\.wikimedia\.org/wikipedia/[^/]+/thumb/
        ^https?://video-subtitle\.tedcdn\.com/
        ^https?://vkontakte\.ru/share\.php\?
        ^https?://vuible\.com/pins-settings/
        ^https?://web\.archive\.org/web/[^/]+/https?\:/[^/]+\.addthis\.com/.+/static/.+/static/
        ^https?://wow\.ya\.ru/posts_(add|share)_link\.xml\?
        ^https?://www\.addthis\.com/bookmark\.php\?
        ^https?://www\.addtoany\.com/(add_to/|share_save\?)
        ^https?://www\.amazon\.com/.+/logging/log-action\.html
        ^https?://www\.blinklist\.com/index\.php\?Action=Blink/addblink\.php
        ^https?://www\.blogger\.com/feeds/\d+/\d+/comments/default/\d+
        ^https?://www\.blogger\.com/feeds/\d+/posts/default/\d+
        ^https?://www\.dreamwidth\.org/tools/(memadd|tellafriend)\?
        ^https?://www\.flickr\.com/(explore/|photos/[^/]+/(sets/\d+/(page\d+/)?)?)\d+_[a-f0-9]+(_[a-z])?\.jpg$
        ^https?://www\.flickr\.com/change_language\.gne
        ^https?://www\.google\.com/(reader/link\?|buzz/post\?)
        ^https?://www\.google\.com/accounts/AccountChooser
        ^https?://www\.google\.com/bookmarks/mark\?
        ^https?://www\.google\.com/recaptcha/(api|mailhide/d\?)
        ^https?://www\.infomous\.com/cloud_widget/lib/lib/
        ^https?://www\.khaleejtimes\.com/.+/images/.+/images/
        ^https?://www\.khaleejtimes\.com/.+/imgactv/.+/imgactv/
        ^https?://www\.khaleejtimes\.com/.+/kt_.+/kt_
        ^https?://www\.khanacademy\.org(/.*|/)page/%d/$
        ^https?://www\.khanacademy\.org/(wp-admin/|wp-login\.php\?)
        ^https?://www\.khanacademy\.org/.*%5Cx26route=/archive
        ^https?://www\.khanacademy\.org/.*&amp;amp;amp;
        ^https?://www\.khanacademy\.org/.*(\?|%5Cx26)route=(/page/:page|/archive/:year/:month|/tagged/:tag|/post/:id|/image/:post_id)
        ^https?://www\.khanacademy\.org/.*amp%3Bamp%3Bamp%3B
        ^https?://www\.khanacademy\.org/.+/%3Ca%20href=
        ^https?://www\.khanacademy\.org/.+/jetpack-comment/\?blogid=\d+&postid=\d+
        ^https?://www\.khanacademy\.org/.+/plugins/ultimate-social-media-plus/.+/like/like/
        ^https?://www\.khanacademy\.org/.+/quote-comment-\d+/$
        ^https?://www\.khanacademy\.org/.+[\?&](replyto(com)?|like_comment)=\d+
        ^https?://www\.khanacademy\.org/.+[\?&]mode=reply
        ^https?://www\.khanacademy\.org/.+[\?&]share=[a-z]{4,}
        ^https?://www\.khanacademy\.org/.+\?showComment(=|%5C)\d+
        ^https?://www\.khanacademy\.org/search(/label/[^\?]+|\?q=[^&]+|)[\?&]updated-(min|max)=\d{4}-\d\d-\d\dT\d\d:\d\d:\d\d.*&max-results=\d+
        ^https?://www\.linkedin\.com/(cws/share|shareArticle)\?
        ^https?://www\.livejournal\.com/(tools/memadd|update|(identity/)?login)\.bml\?
        ^https?://www\.netvibes\.com/subscribe\.php\?
        ^https?://www\.newsvine\.com/_wine/save\?
        ^https?://www\.odnoklassniki\.ru/dk\?st\.cmd=addShare
        ^https?://www\.warnerbros\.com/\d+$
        ^https?://www\.youtube\.com/.*\[\[.+\]\]
        ^https?://www\.youtube\.com/.*\{\{.+\}\}
        ^https?://zakladki\.yandex\.ru/newlink\.xml\?
Disconnected from ws:// server: ConnectionRefusedError(111, "Connect call failed ('127.0.0.1', 29000)")
Imported /home/thetechrobo/grab-site/www.khanacademy.org-test-prep-mcat-2021-05-05-37e0452b/max_content_length
https://www.khanacademy.org/test-prep/mcat ...
/home/thetechrobo/gs-venv/lib/python3.7/site-packages/wpull/protocol/http/client.py:185: UserWarning: HTTP session did not complete.
  warnings.warn(_('HTTP session did not complete.'))
Disconnected from ws:// server: ConnectionRefusedError(111, "Connect call failed ('127.0.0.1', 29000)")
ERROR Fatal exception.
Traceback (most recent call last):
  File "/home/thetechrobo/gs-venv/lib/python3.7/site-packages/sqlalchemy/orm/persistence.py", line 1936, in _do_pre_synchronize_evaluate
    eval_condition = evaluator_compiler.process(*crit)
  File "/home/thetechrobo/gs-venv/lib/python3.7/site-packages/sqlalchemy/orm/evaluator.py", line 85, in process
    return meth(clause)
  File "/home/thetechrobo/gs-venv/lib/python3.7/site-packages/sqlalchemy/orm/evaluator.py", line 181, in visit_binary
    map(self.process, [clause.left, clause.right])
  File "/home/thetechrobo/gs-venv/lib/python3.7/site-packages/sqlalchemy/orm/evaluator.py", line 85, in process
    return meth(clause)
  File "/home/thetechrobo/gs-venv/lib/python3.7/site-packages/sqlalchemy/orm/evaluator.py", line 88, in visit_grouping
    return self.process(clause.element)
  File "/home/thetechrobo/gs-venv/lib/python3.7/site-packages/sqlalchemy/orm/evaluator.py", line 83, in process
    "Cannot evaluate %s" % type(clause).__name__
sqlalchemy.orm.evaluator.UnevaluatableError: Cannot evaluate Select

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/thetechrobo/gs-venv/lib/python3.7/site-packages/wpull/application/app.py", line 157, in run
    yield from pipeline.process()
  File "/home/thetechrobo/gs-venv/lib/python3.7/site-packages/wpull/pipeline/pipeline.py", line 194, in process
    yield from self._process_one_worker()
  File "/home/thetechrobo/gs-venv/lib/python3.7/site-packages/wpull/pipeline/pipeline.py", line 215, in _process_one_worker
    task.result()
  File "/home/thetechrobo/gs-venv/lib/python3.7/site-packages/wpull/pipeline/pipeline.py", line 119, in process
    item = yield from self.process_one(_worker_id=worker_id)
  File "/home/thetechrobo/gs-venv/lib/python3.7/site-packages/wpull/pipeline/pipeline.py", line 103, in process_one
    yield from task.process(item)
  File "/home/thetechrobo/gs-venv/lib/python3.7/site-packages/wpull/application/tasks/download.py", line 421, in process
    yield from session.app_session.factory['Processor'].process(session)
  File "/home/thetechrobo/gs-venv/lib/python3.7/site-packages/wpull/processor/delegate.py", line 29, in process
    return (yield from processor.process(item_session))
  File "/home/thetechrobo/gs-venv/lib/python3.7/site-packages/wpull/processor/web.py", line 91, in process
    return (yield from session.process())
  File "/home/thetechrobo/gs-venv/lib/python3.7/site-packages/wpull/processor/web.py", line 185, in process
    yield from self._process_loop()
  File "/home/thetechrobo/gs-venv/lib/python3.7/site-packages/wpull/processor/web.py", line 244, in _process_loop
    exit_early, wait_time = yield from self._fetch_one(cast(Request, self._item_session.request))
  File "/home/thetechrobo/gs-venv/lib/python3.7/site-packages/wpull/processor/web.py", line 308, in _fetch_one
    action = self._handle_response(request, response)
  File "/home/thetechrobo/gs-venv/lib/python3.7/site-packages/wpull/processor/web.py", line 410, in _handle_response
    self._item_session.update_record_value(status_code=response.status_code)
  File "/home/thetechrobo/gs-venv/lib/python3.7/site-packages/wpull/pipeline/session.py", line 176, in update_record_value
    self.app_session.factory['URLTable'].update_one(self.url_record.url, **kwargs)
  File "/home/thetechrobo/gs-venv/lib/python3.7/site-packages/wpull/database/wrap.py", line 72, in update_one
    return self.url_table.update_one(*args, **kwargs)
  File "/home/thetechrobo/gs-venv/lib/python3.7/site-packages/wpull/database/sqltable.py", line 196, in update_one
    session.execute(query)
  File "/home/thetechrobo/gs-venv/lib/python3.7/site-packages/sqlalchemy/orm/session.py", line 1638, in execute
    _parent_execute_state is not None,
  File "/home/thetechrobo/gs-venv/lib/python3.7/site-packages/sqlalchemy/orm/persistence.py", line 1821, in orm_pre_session_exec
    update_options,
  File "/home/thetechrobo/gs-venv/lib/python3.7/site-packages/sqlalchemy/orm/persistence.py", line 1949, in _do_pre_synchronize_evaluate
    from_=err,
  File "/home/thetechrobo/gs-venv/lib/python3.7/site-packages/sqlalchemy/util/compat.py", line 211, in raise_
    raise exception
sqlalchemy.exc.InvalidRequestError: Could not evaluate current criteria in Python: "Cannot evaluate Select". Specify 'fetch' or False for the synchronize_session execution option.
CRITICAL Sorry, Wpull unexpectedly crashed.

Sorry if this is a stupid question; I'm a newcomer.

TheTechRobo changed the title from "Can't backup site" to "Can't evaluate Select" on May 5, 2021
@ivan
Contributor

ivan commented May 5, 2021

Thank you for the report. I believe this is caused by a sqlalchemy 1.4 incompatibility, as found in ArchiveTeam/wpull#463.

I will try to get this fixed soon. In the meantime, the Nix-based grab-site install might work (it still has the older sqlalchemy): https://github.com/ArchiveTeam/grab-site#install-on-another-distribution-lacking-python-37x
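For anyone trying to tell whether their environment is in the range discussed here, a minimal sketch follows. It assumes, based only on this thread, that sqlalchemy 1.4 and later can trigger the "Cannot evaluate Select" crash with the affected wpull code and that 1.3.x does not. The function names are hypothetical and are not part of grab-site or wpull.

```python
import re
from typing import Optional, Tuple


def parse_major_minor(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from a version string such as '1.4.11' or '1.4.0b1'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError("unrecognized version string: %r" % version)
    return int(match.group(1)), int(match.group(2))


def may_hit_select_error(sqlalchemy_version: str) -> bool:
    """Assumption from this thread: 1.4 and later are affected, 1.3.x is not."""
    return parse_major_minor(sqlalchemy_version) >= (1, 4)


def installed_sqlalchemy_version() -> Optional[str]:
    """Return the installed sqlalchemy version, or None if it is not importable."""
    try:
        import sqlalchemy
    except ImportError:
        return None
    return sqlalchemy.__version__


if __name__ == "__main__":
    found = installed_sqlalchemy_version()
    if found is None:
        print("sqlalchemy is not installed in this environment")
    elif may_hit_select_error(found):
        print("sqlalchemy %s is in the range reported as incompatible here" % found)
    else:
        print("sqlalchemy %s predates 1.4" % found)
```

Run it with the Python interpreter from the same virtualenv that grab-site uses (gs-venv in the log above), otherwise it reports on a different installation.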

@TheTechRobo
Contributor Author

TheTechRobo commented May 5, 2021

Running pip3 uninstall sqlalchemy followed by pip3 install sqlalchemy==1.3.\* works for me. I'm still puzzled by the Disconnected from ws:// server: ConnectionRefusedError(111, "Connect call failed ('127.0.0.1', 29000)") message, though. I assume this is a separate issue?

@TheTechRobo
Contributor Author

Figured it out. gs-server wasn't running.
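A quick way to confirm this kind of cause is to check whether anything is accepting TCP connections on the address from the log. The sketch below takes the host and port (127.0.0.1, 29000) from the error message above; the function name is hypothetical, and it only tests that the port is open, not that the listener is actually gs-server.

```python
import socket


def is_listening(host: str = "127.0.0.1", port: int = 29000, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within the timeout."""
    try:
        # create_connection raises OSError (including ConnectionRefusedError
        # and timeouts) when nothing accepts the connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    if is_listening():
        print("Something is listening on 127.0.0.1:29000")
    else:
        print("Nothing is listening on 127.0.0.1:29000; is gs-server running?")
```

If this reports that nothing is listening, the "Disconnected from ws:// server" lines are expected, and they are unrelated to the sqlalchemy crash.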

@ivan
Contributor

ivan commented May 10, 2021

This should be fixed in grab-site 2.2.1.
