URLHeadBear: fake the user agent #2879

Open
wants to merge 1 commit into base: master

Akhelesh

A random fake user agent is used to request webpages
that do not allow bot visits.

Fixes #1203

For short term contributors: we understand that getting your commits well
defined like we require is a hard task and takes some learning. If you
look to help without wanting to contribute long term there's no need
for you to learn this. Just drop us a message and we'll take care of brushing
up your stuff for merge!

Checklist

  • I read the commit guidelines and I've followed
    them.
  • I ran coala over my code locally. (All commits have to pass
    individually.
    It is not sufficient to have "fixup commits" on your PR,
    our bot will still report the issues for the previous commit.) You will
    likely receive a lot of bot comments and build failures if coala does not
    pass on every single commit!

After you submit your pull request, DO NOT click the 'Update Branch' button.
When asked for a rebase, consult coala.io/rebase
instead.

Please consider helping us by reviewing other people's pull requests as well:

The more you review, the more your score will grow at coala.io and we will
review your PRs faster!

@gitmate-bot
Collaborator

Comment on 5d00e26.

Shortlog of HEAD commit does not match given regex: ([^:]|[^:]+: [A-Z0-9].*)

Origin: GitCommitBear, Section: all.commit.

@gitmate-bot
Collaborator

Comment on 87fa05e.

Shortlog of HEAD commit does not match given regex: ([^:]|[^:]+: [A-Z0-9].*)

Origin: GitCommitBear, Section: all.commit.
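
To see what the bot is objecting to, the reported regex can be tried locally. The sketch below assumes the check requires the whole shortlog to match (hence `re.fullmatch`); the bear's exact matching mode is not shown in this thread. Both sample shortlogs are hypothetical, since the commit's actual message is not shown either.

```python
import re

# Regex copied verbatim from the gitmate-bot comment.
SHORTLOG_REGEX = r'([^:]|[^:]+: [A-Z0-9].*)'


def shortlog_ok(shortlog):
    """Return True if the whole shortlog matches the regex.

    Assumption: the check is a full-string match.
    """
    return re.fullmatch(SHORTLOG_REGEX, shortlog) is not None


# Hypothetical shortlogs, for illustration only.
print(shortlog_ok('URLHeadBear: Fake the user agent'))  # True
print(shortlog_ok('URLHeadBear: fake the user agent'))  # False
```

Under that assumption, the text after the `tag: ` prefix has to start with an uppercase letter or a digit.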

Contributor

@frextrite left a comment

After looking at the said requirements you may also want to squash the commits.

@@ -11,6 +11,7 @@ cpplint~=1.3
 dennis~=0.9
 docutils-ast-writer~=0.1.2
 eradicate~=0.1.6
+fake-useragent~=0.1.11
Contributor

The user-agent headers in fake-useragent are pretty old. Do we really want to use it?

@@ -82,8 +83,9 @@ def check_prerequisites(cls):
     @staticmethod
     def get_head_response(url, timeout):
         try:
+            headers = {'User-Agent': UserAgent().random}
Contributor

And why are we using a random UserAgent? A specific browser header, such as Firefox or Chrome, would be much better.

On that note, why do we even need to generate a user agent? Why not just hardcode the latest Chrome header?
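
A minimal sketch of the hardcoded approach suggested here, using only the standard library instead of the bear's own HTTP client. `CHROME_USER_AGENT` and `build_head_request` are hypothetical names chosen for illustration; the user-agent string is the one quoted elsewhere in this thread.

```python
import urllib.request

# Fixed browser User-Agent; any current browser string could be
# substituted. The constant name is an assumption, not the bear's code.
CHROME_USER_AGENT = (
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 '
    '(KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36'
)


def build_head_request(url):
    """Build (but do not send) a HEAD request with a fixed User-Agent.

    Hypothetical helper for illustration; it is not part of the bear.
    """
    return urllib.request.Request(
        url,
        headers={'User-Agent': CHROME_USER_AGENT},
        method='HEAD',
    )


request = build_head_request('https://example.com')
print(request.get_method())
# urllib stores header names capitalized as 'User-agent'.
print(request.get_header('User-agent'))
```

Sending the request would then be a matter of passing it to `urllib.request.urlopen(request, timeout=timeout)`. A fixed string also keeps the behaviour deterministic, which makes it easier to assert on in tests than a random one.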

Author

@frextrite I tried a random user agent because some websites block a client that makes many requests. After more searching, I found that this is useless unless IP addresses are rotated too, so your solution suits this case better.

A random fake user agent is used to request webpages
that do not allow bot visits.

Fixes coala#1203
Contributor

@frextrite left a comment

A few more changes:

  1. Update the commit body to reflect the updated changes.
  2. Read, follow, and tick the items in the PR body.
  3. Look at the Travis CI build logs and fix the failing tests.

Then you'll be good to go.

@sladyn98

sladyn98 commented Mar 3, 2019

You could run coala after you make all the changes and finish committing them. That would make things easier after pushing.
Also, in terms of the testing:
headers={'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36'}
Add these headers to your assertion tests. Travis seems to be failing because of that.
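
As a rough illustration of why the assertions would need updating: once the helper passes a `headers` keyword, any mock assertion on the call has to include it too. The repository's real tests are not shown in this thread, so `get_head_response` below is a hypothetical stand-in that takes the HTTP call as an injected `head` argument, which lets the example run without network access.

```python
from unittest import mock

USER_AGENT = (
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 '
    '(KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36'
)


def get_head_response(url, timeout, head):
    """Hypothetical stand-in for the bear's helper.

    `head` is whatever callable performs the HTTP HEAD request.
    """
    return head(url, timeout=timeout, headers={'User-Agent': USER_AGENT})


fake_head = mock.Mock(return_value='fake response')
result = get_head_response('https://example.com', 15, fake_head)

# An assertion written before the change (without `headers`) would fail
# here; the expected call has to list the new keyword as well.
fake_head.assert_called_once_with(
    'https://example.com',
    timeout=15,
    headers={'User-Agent': USER_AGENT},
)
```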

@Akhelesh
Author

Akhelesh commented Mar 4, 2019

You could run coala after you make all the changes and finish committing them. That would make things easier after pushing.
Also, in terms of the testing:
headers={'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36'}
Add these headers to your assertion tests. Travis seems to be failing because of that.

Yeah, I have created an issue for that to keep it separate from this one.

Contributor

@frextrite left a comment

I think instead of creating a separate issue you may want to work on the tests in this PR directly. Since the changes in this PR are directly affecting the tests, they should be resolved in this PR.

Development

Successfully merging this pull request may close these issues.

URLHeadBear: fake the user-agent
5 participants