About Super.Hosts (included in Blackweb) #6
Comments
Nice 👍 😸 @mitchellkrogza is on holiday but I think he would say: AWESOME 👍 💯 🥇 ⭐ 🏆 I'm working on a better way to clean this repository with funceble, so maybe when he comes back from holiday my next release or pre-release will handle big-file automation with Travis 😸 |
Great. To express our congratulations |
That's awesome news. Yes, I am indeed on holiday, but this is certainly great
to hear. Still lots of fine-tuning needed on this list, but so glad it has been
included as a source for your project.
|
We have also included the following sources: |
Hi @maravento, I am back from my trip away. 👍 Thanks so much for including this and my other sources in your BlackWeb Project. I have pushed out some new updates today, removed a few false positives, and the list size of Ultimate Hosts has grown again with fresh data from remote sources. |
great news. |
@maravento I've been doing some big cleaning up and de-duping. I'm busy with another big clean of dead and expired domains, which is going to take some time to complete, but you will see the list is reduced somewhat and has not one dupe. I found an error in my scripting where some input files were in DOS format, causing dupes to be created, but that is all fixed now. |
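For readers wondering how DOS-format input produces dupes: a line ending in `\r\n` leaves a stray `\r` on the entry when a script splits on `\n` only, so `example.com\r` and `example.com` count as two different entries. The following is a minimal Python sketch of one way to normalize and de-dupe, not the scripting actually used in the repo; the function name `dedupe_lines` and the sample data are hypothetical.

```python
def dedupe_lines(raw):
    """Return unique, non-empty entries from raw bytes, in first-seen order."""
    seen = set()
    unique = []
    # splitlines() treats \r\n, \r and \n as line breaks alike,
    # so DOS and Unix input files yield identical entries.
    for line in raw.decode("utf-8", errors="replace").splitlines():
        entry = line.strip().lower()  # also drops stray spaces and case differences
        if entry and entry not in seen:
            seen.add(entry)
            unique.append(entry)
    return unique


# Hypothetical input mixing DOS (\r\n) and Unix (\n) line endings
mixed = b"example.com\r\nexample.com\nbad.example\r\n"
print(dedupe_lines(mixed))  # ['example.com', 'bad.example']
```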
thanks for the info. I will update blackweb, but there is a problem. I would need the list of excluded domains to exclude them also from blackweb |
@maravento Can you possibly send me the list you have so I can run it against current list and provide you with the removed stuff? Does blackweb not pull it from the repo and update it? I've provided raw links in the README to all the raw files. |
Blackweb does not remove domains. It only adds bad domains (except whitelisted ones). PS: By the way, 7.827.420 black domains (Blackweb downloads many sources) |
@maravento it's taken several weeks of work, but we now have a central repo for controlling removals of dead domains and dead blogspot domains, with more coming. This central repo also controls whitelisting and removal of false positives. So all projects now draw from this central information and have list-cleaning functions which do the stripping and removals on every build. The central control system can be found at: https://github.com/mitchellkrogza/CENTRAL-REPO.Dead.Inactive.Whitelisted.Domains.For.Hosts.Projects and we'd love any contributions or additions from you; simply send PRs wherever applicable. Right now on dead domains we have a 99% accuracy rate. Each list will be re-tested from time to time to check for domains or web sites that have become re-active, and they will be added to the re-active-domains list. This now gives much better control across all repos and reduces the chance of any whitelisted or false positive domains being re-added to a list by mistake. |
So @maravento you can still do regular pulls of the domains lists from: Nginx Ultimate Bad bot Blocker And then run your removals / cleaning using CENTRAL REPO - Dead Domain - All Combined |
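The pull-then-clean step described above amounts to a set difference: take the pulled domain list and drop everything that appears in the combined removal list. A minimal Python sketch of that idea follows; `clean_blocklist` and the sample domains are hypothetical, and this is not the cleaning function used by either project.

```python
def clean_blocklist(domains, removals):
    """Drop every domain that appears in the removal list, keeping input order."""
    # Normalize the removal list once so lookups are O(1)
    removal_set = {r.strip().lower() for r in removals if r.strip()}
    cleaned = []
    for domain in domains:
        name = domain.strip().lower()
        if name and name not in removal_set:
            cleaned.append(name)
    return cleaned


# Hypothetical sample data standing in for the pulled list and the combined dead-domain list
pulled = ["bad.example", "dead.example", "worse.example"]
dead = ["dead.example"]
print(clean_blocklist(pulled, dead))  # ['bad.example', 'worse.example']
```

Because the removal list is applied on every build, a domain dropped once stays out even if an upstream source keeps re-listing it.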
Great news. On Monday we will update blackweb project with the new repositories. |
Hi @maravento, yes, that's the same repo. If you look at the README on https://github.com/mitchellkrogza/Ultimate.Hosts.Blacklist you can see the different raw files. The super hosts file is comprised of domain names plus IP addresses |
This Raw file is a plain text list of domain names only. And this raw file is a plain text list of bad IPs only. Those two lists have no commenting or anything and don't include the 0.0.0.0 domain.com or 0.0.0.0 IP.IP.IP.IP format. So for your uses, pulling those might be easiest |
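For anyone starting from the hosts-format file rather than the plain raw lists, the difference is just the leading `0.0.0.0` field and the comments. A minimal Python sketch of stripping both, assuming each entry has the form `0.0.0.0 <name-or-ip>`; the function name `hosts_to_plain` and the `sink` parameter are hypothetical.

```python
def hosts_to_plain(lines, sink="0.0.0.0"):
    """Extract the second field from hosts-style lines such as '0.0.0.0 domain.com'."""
    entries = []
    for line in lines:
        # Drop trailing comments and surrounding whitespace
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        # Keep only lines that start with the sink address and carry a target
        if len(fields) >= 2 and fields[0] == sink:
            entries.append(fields[1])
    return entries


# Hypothetical sample in hosts format
sample = ["# comment line", "0.0.0.0 bad.example", "", "0.0.0.0 203.0.113.7  # note"]
print(hosts_to_plain(sample))  # ['bad.example', '203.0.113.7']
```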
> This Raw file is a plain text list of domain names only
Please check the clarification. Both links are the same. |
Sorry, working off my mobile. IP List only. Domains list only. |
better. So, with your permission, we will include in the next update of blackweb, the new repositories: And in the blackip, the new repository: |
Absolutely no permission required and please let me know how it works out |
@maravento added a new whitelisted IP ranges file today > https://github.com/mitchellkrogza/CENTRAL-REPO.Dead.Inactive.Whitelisted.Domains.For.Hosts.Projects/blob/master/whitelisted-ip-ranges-ALL-combined.txt |
Great. About ip.list |
About whitelisted IPs/CIDR: https://github.com/mitchellkrogza/CENTRAL-REPO.Dead.Inactive.Whitelisted.Domains.For.Hosts.Projects/blob/master/whitelisted-ip-ranges-ALL-combined.txt In our project whiteip IPs/CIDR Add from Central-Repo: Problems detected in Central-Repo: 66.249.64.0/18 and 66.249.80.0/20 (should be 66.249.64.0/19) |
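The two ranges flagged above overlap: 66.249.80.0/20 sits entirely inside 66.249.64.0/18. A minimal sketch of how such nested entries could be detected with Python's standard `ipaddress` module follows; the helper name `find_nested` is hypothetical, and the check only reports redundancy. Whether /18 or /19 is the right prefix is a data question the code cannot answer.

```python
import ipaddress


def find_nested(cidrs):
    """Return (inner, outer) pairs where one listed range lies inside another."""
    nets = [ipaddress.ip_network(c) for c in cidrs]
    nested = []
    for inner in nets:
        for outer in nets:
            # subnet_of() requires both networks to share an IP version
            if inner is not outer and inner.version == outer.version and inner.subnet_of(outer):
                nested.append((str(inner), str(outer)))
    return nested


ranges = ["66.249.64.0/18", "66.249.80.0/20"]
print(find_nested(ranges))
# [('66.249.80.0/20', '66.249.64.0/18')]

# collapse_addresses() merges redundant ranges into the minimal covering set
collapsed = ipaddress.collapse_addresses(ipaddress.ip_network(c) for c in ranges)
print([str(n) for n in collapsed])
# ['66.249.64.0/18']
```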
@maravento please check now if the list looks better / correct > https://hosts.ubuntu101.co.za/ips.list I used the following
|
A suggestion. Separate ipv4 and ipv6 lists (because debugging is different in both) |
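One way to do the split, sketched in Python with the standard `ipaddress` module: parse each entry, route it by IP version, and sort each list numerically. The function name `split_by_version` is hypothetical, and silently dropping unparseable lines is an assumption; a real build might prefer to log them.

```python
import ipaddress


def split_by_version(entries):
    """Split IP/CIDR strings into numerically sorted IPv4 and IPv6 lists."""
    v4, v6 = [], []
    for entry in entries:
        text = entry.strip()
        try:
            # strict=False accepts entries with host bits set, e.g. 10.0.0.1/8
            net = ipaddress.ip_network(text, strict=False)
        except ValueError:
            continue  # assumption: skip anything that is not an IP or CIDR
        (v4 if net.version == 4 else v6).append((net, text))
    # Sort by the parsed network, but return the original text unchanged
    v4_sorted = [text for _, text in sorted(v4, key=lambda pair: pair[0])]
    v6_sorted = [text for _, text in sorted(v6, key=lambda pair: pair[0])]
    return v4_sorted, v6_sorted


# Hypothetical mixed input
mixed = ["10.0.0.2", "2001:db8::/32", "9.9.9.9 ", "not-an-ip"]
ipv4, ipv6 = split_by_version(mixed)
print(ipv4)  # ['9.9.9.9', '10.0.0.2']
print(ipv6)  # ['2001:db8::/32']
```

Sorting on the parsed network rather than the string avoids the usual text-sort surprise where `10.0.0.2` lands before `9.9.9.9`.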
👍 @maravento shall implement tomorrow |
@maravento have we got a better layout now on CENTRAL REPO ?? Have a look and give me some feedback please. https://github.com/mitchellkrogza/CENTRAL-REPO.Dead.Inactive.Whitelisted.Domains.For.Hosts.Projects @funilrys any feedback / comments from you on this new layout? |
Once you guys are both happy, I can re-write the cleaning functions on all my other repos, including this one of course. |
I think you have done a great job of debugging and reorganizing. I congratulate you |
Thanks so much @maravento, much appreciated, and I truly appreciate all your input. Have a great weekend |
Hi. |
Thanks @maravento, all those lists are currently undergoing re-testing. We will soon have them fixed and will keep certain domains on the list regardless of whether their state changes. It's a big work in progress, this one. |
I think this is where we will use https://github.com/mitchellkrogza/CENTRAL-REPO.Dead.Inactive.Whitelisted.Domains.For.Hosts.Projects/blob/master/DOMAINS-re-active.txt to keep a list of permanently blacklisted domains and also add to it from our re-tests when previously inactive domains are active again. |
that's too much debugging work |
Gotta re-think this whole thing a bit. No need to tell me how much work this is turning out to be 😂 |
I give you the same suggestion I gave you days ago: |
@maravento thanks, I'm going to take a fresh start on the way dead domains are controlled on central repo. As always, I appreciate your input and help. 👍 |
@maravento I took a fresh start to CENTRAL REPO now, focusing mostly on dead blogspot domains which we know 100% are dead and the list of invalid domains. Now a much smaller list of dead domains and will just focus on invalid domains. https://github.com/mitchellkrogza/CENTRAL-REPO.Dead.Inactive.Whitelisted.Domains.For.Hosts.Projects |
I have tried the new version of DOMAINS-dead.txt and it works perfectly (I have verified the domains and they are really dead or invalid). I have included it in the next update of Blackweb |
Thanks so much @maravento and thanks again for your valuable insight and help. It's now a much smaller list and re-testing it will only ever take a few hours. I have some tests running on the complete Ultimate hosts list which will give us some additional invalid domains which will be tested once more before being added to domains-invalid. 👍 |
Hi. @mitchellkrogza There are many duplicate IPs / CIDRs in the ips.list. This happens because many lines have a space at the end and generate duplicates. Example: |
Thanks @maravento, I found the problem: only one of the input sources, yoyo.org, had spaces. All spaces removed and new build in progress. |
I found in ips.list: Also I found 0.0.0.0/8, 10.0.0.0/8, 100.64.0.0/10, 192.0.0.0/24, 192.0.2.0/24, 192.168.0.0/16, etc. (reserved IP addresses) (?) |
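A check for such entries can be sketched in Python with the standard `ipaddress` module by testing each entry for overlap with a table of reserved networks. The table below holds only the six ranges named in this comment, so it is deliberately incomplete (the "etc." is not filled in), and the function name `flag_reserved` is hypothetical.

```python
import ipaddress

# Only the reserved ranges named in this thread; not a complete registry
RESERVED = [
    ipaddress.ip_network(cidr)
    for cidr in (
        "0.0.0.0/8",
        "10.0.0.0/8",
        "100.64.0.0/10",
        "192.0.0.0/24",
        "192.0.2.0/24",
        "192.168.0.0/16",
    )
]


def flag_reserved(entries):
    """Return the IPv4 entries that fall in or overlap a reserved range."""
    flagged = []
    for entry in entries:
        text = entry.strip()
        try:
            net = ipaddress.ip_network(text, strict=False)
        except ValueError:
            continue  # not an IP or CIDR; out of scope for this check
        # overlaps() is only meaningful between networks of the same version
        if net.version == 4 and any(net.overlaps(r) for r in RESERVED):
            flagged.append(text)
    return flagged


# Hypothetical sample entries
sample = ["192.168.1.0/24", "8.8.8.8", "10.0.0.0/8", "2001:db8::1"]
print(flag_reserved(sample))  # ['192.168.1.0/24', '10.0.0.0/8']
```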
Thanks @maravento currently ips.list is combined ipv4 and ipv6 and also needs better sorting which I will do tomorrow. I will also address those reserved ranges. Can you spot any more dupes? |
I think all those reserved IP ranges originate from the yoyo.org input source. Will have it sorted tomorrow. |
I think that's all |
I recommend using this pipe to discard anything unusual (ipv6, spaces...): |
Great work. We have included your list (Ultimate Super.Hosts Blacklist ) in Data sheet (sources) of our Blackweb project based on Squid-Cache for Linux
Special Thanks