What does Google rely on to classify the web-page? #50

Closed
075KG opened this issue Mar 15, 2017 · 4 comments

Comments

@075KG

075KG commented Mar 15, 2017

Hi,
We can get three kinds of results from sbserver: MALWARE-ANY_PLATFORM-URL, UNWANTED_SOFTWARE-ANY_PLATFORM-URL, and SOCIAL_ENGINEERING-ANY_PLATFORM-URL.
So how does Google's server classify a web page into one of these three types?

Thanks
@colonelxc
Contributor

This has a very high level description of our process: https://www.google.com/transparencyreport/safebrowsing/faq/?hl=en

Here's a link to our policies around these categories: https://safebrowsing.google.com/#policies

@075KG
Author

075KG commented Mar 20, 2017

Thanks a lot, it helps me so much! ^.^
But the FAQ does not describe very clearly how the process works. Two examples:
1. How do you determine that a site is unsafe?
For malware sites, we scan sections of our web index to identify potentially compromised websites. Then we test those sites by using a virtual machine to see if the machine gets infected. We use statistical models to identify phishing sites.

So, how does Google use statistical models to identify phishing sites?

2. How do you tell the difference between a compromised site and an attack site?
Our scanners can differentiate between the sites that exploit the browser and those that are compromised so that they lead to exploited sites. Sites that exploit the browser are attack sites.

How do the scanners differentiate between sites that exploit the browser and sites that are compromised?

@alexwoz
Collaborator

alexwoz commented Mar 20, 2017

Hi @075KG,

We don't provide much detail about our detection infrastructure. However, we have published some papers that you may find interesting:

@075KG
Author

075KG commented Mar 23, 2017

Thanks a lot~ @alexwoz
