Request for improved content organization #1
Comments
Hi Tibso,
Thank you for your feedback. I agree that a restructuring is needed. In fact, these adjustments are part of a broader set of changes that will enrich the current data with metadata and other formats. I haven't had the chance to update this in the current version yet.
Short-term changes: In the immediate future, I plan to move the data from public.dir/* to the root directory to make it easier to access. Additionally, for clarity, the "DIGISQUAD-COM" string in filenames will likely be removed.
Categories and data sources: Categories such as ads, malware, and adult content are ones I've been thinking about adding to the system. Do you have specific data sources you'd recommend for these?
About option 1: The idea is a directory-and-file-based organization of the form [ dataType (directory) ] / [ category (file) ]:
I find this structure efficient, but if you have concerns, please share them.
About option 2: Since prefixes matter to Redis users, I could append .redis during list generation. This could yield:
But for consistency, the data type might be prefixed, leading to:
Option 3: Based on your feedback and the existing challenges, I'm considering a more tailored solution. You could specify your desired format, and I'd provide it in a dedicated directory, potentially named dnsblacklist-rs. The naming convention would be something close to [ dataType ] / [ dataFormat ] / [ category (file) ]. Examples:
Specifically for your use-case:
Then I could implement your prepend suggestion like:
But you would probably prefer a single consolidated file containing domains from all categories, instead of separate files per category, right? If this last solution suits your needs, I'll implement it as my availability allows.
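As a rough illustration of the consolidated-file idea, the sketch below merges per-category lists into one list of `category:domain` records, the prefix form suggested in the original request. The function name `consolidate` and the in-memory dictionary input are hypothetical; nothing here reflects how the repository actually generates its lists.

```python
def consolidate(per_category: dict[str, list[str]]) -> list[str]:
    """Merge per-category domain lists into sorted 'category:domain' records.

    Hypothetical helper: the record format follows the 'ads:adspam.com'
    style proposed in this thread, not an existing repository format.
    """
    records = set()
    for category, domains in per_category.items():
        for domain in domains:
            normalized = domain.strip().lower()
            if normalized:  # skip blank entries
                records.add(f"{category}:{normalized}")
    # Sorting keeps the output stable between runs, which keeps diffs small.
    return sorted(records)


if __name__ == "__main__":
    merged = consolidate({
        "malware": ["getpwnd.net"],
        "ads": ["adspam.com", "ADSPAM.com "],
    })
    for record in merged:
        print(record)
```

Duplicates that differ only in case or surrounding whitespace collapse into one record, so the consolidated file stays no larger than the per-category files combined.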
Here's my assessment of the available options:
Option 1: The first option appears to be the most efficient approach to organizing the data. Distinct directories spare end-users the inconvenient task of sorting the data themselves. This option may also appeal to users who would rather not deal with a single very large file. It loosely aligns with your existing practice of keeping the record count per file under a fixed threshold, such as 50k or 100k.
Option 2: While prefixes are undeniably useful for Redis users, I don't think a separate directory solely for Redis users warrants the added complexity. The same result can be achieved by simply prefixing the data type to each individual record.
Option 3: As with Option 2, I don't think tailored solutions are justified given the added complexity. Directories for specific use-cases would increase the repository's size even though the underlying data remains nearly identical. For my specific use-case, separate files are preferable because they avoid extensive data sorting.
Given these considerations, my recommendation is to go with the first option, which I believe holds the most promise.
Regarding the data sources to utilize, firebog.net appears to be the most comprehensive compilation of domains suitable for blocking that I have come across. Moreover, it would be highly advantageous if the MISP warning lists could be incorporated to identify potential false positives. Finally, it would be of great interest to receive real-time updates from MISP concerning IP addresses and domains that should be blocked.
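To make the false-positive idea concrete, here is a minimal sketch that separates blocklist entries found in a warning list from the rest. It assumes the warning list is a JSON document whose entries sit under a top-level `list` key, and it only does exact, case-insensitive matching; CIDR ranges, regexes, and subdomain matching are deliberately left out. The function names are hypothetical.

```python
import json


def load_warning_entries(json_text: str) -> set[str]:
    """Read entries from a warning-list JSON document.

    Assumption: the entries live under a top-level "list" key.
    """
    data = json.loads(json_text)
    return {str(entry).strip().lower().rstrip(".") for entry in data.get("list", [])}


def split_false_positives(blocklist, warning_entries):
    """Return (kept, flagged): flagged entries also appear in the warning list."""
    kept, flagged = [], []
    for entry in blocklist:
        normalized = entry.strip().lower().rstrip(".")
        if not normalized:
            continue  # ignore blank lines
        (flagged if normalized in warning_entries else kept).append(normalized)
    return kept, flagged


if __name__ == "__main__":
    sample = '{"name": "illustrative list", "list": ["example.org"]}'
    kept, flagged = split_false_positives(
        ["adspam.com", "Example.org"], load_warning_entries(sample)
    )
    print("kept:", kept)
    print("flagged for review:", flagged)
```

Flagged entries are returned rather than silently dropped, so a maintainer can review them before deciding whether they really are false positives.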
Hi T, OK, thank you, I will review this data source. Note that I have other sources that may align more specifically with your case(s), but time constraints have limited what I can do for now. Keep in mind that accumulating vast amounts of data is not the hard part; what matters is that the data stays up to date and that efforts are made to minimize false positives.
That said, as explained to your boss, should any pressing updates or features be needed, I am open to discussing professional support for this kind of data. After all, one has to earn a living :)
Greetings,
I would like to propose an enhancement to your repository.
I believe it would significantly improve data usability to establish a clear association between FQDNs, domains, IPs, and the corresponding types of content they represent. I have identified 2 methods to accomplish this.
This could be achieved through the implementation of separate directories or linking mechanisms.
Option 1: Sorting using directories
One approach would involve segregating these items into separate directories, thereby allowing for a more logical retrieval. For instance, a domain associated with ads could be stored as follows:
public.dir/domain/ads/DIGISQUAD-COM-malicious
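A small sketch of how a consumer might build and read back such paths, assuming the layout is exactly root / data type / category / file name as in the example above. The helper names and the `public.dir` default are illustrative only.

```python
from pathlib import PurePosixPath


def list_path(data_type: str, category: str, filename: str,
              root: str = "public.dir") -> str:
    """Build 'root/data_type/category/filename' (hypothetical layout helper)."""
    for part in (data_type, category, filename):
        if not part or "/" in part:
            raise ValueError(f"invalid path component: {part!r}")
    return str(PurePosixPath(root) / data_type / category / filename)


def split_list_path(path: str) -> tuple[str, str, str]:
    """Recover (data_type, category, filename) from the last three components."""
    parts = PurePosixPath(path).parts
    if len(parts) < 3:
        raise ValueError(f"path too short: {path!r}")
    return parts[-3], parts[-2], parts[-1]


if __name__ == "__main__":
    path = list_path("domain", "ads", "DIGISQUAD-COM-malicious")
    print(path)
    print(split_list_path(path))
```

Because the data type and category are encoded in the path, a consumer can select lists by directory alone, without opening any file.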
Option 2: Prepending the type
Alternatively, another approach would be to prefix each domain with its content type:
ads:adspam.com
malware:getpwnd.net
This restructuring would enable a finer level of control when it comes to filtering content.
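For illustration, a consumer of such prefixed records could group them by category and then keep only the categories it cares about. This is a minimal sketch assuming one `category:value` record per line; `parse_prefixed` and `select` are hypothetical names.

```python
from collections import defaultdict


def parse_prefixed(lines):
    """Group 'category:value' records into {category: set of values}."""
    by_category = defaultdict(set)
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comment lines
        # Split on the first colon only, so values containing colons survive.
        category, sep, value = line.partition(":")
        if not sep or not category or not value:
            raise ValueError(f"malformed record: {line!r}")
        by_category[category].add(value.lower())
    return dict(by_category)


def select(by_category, wanted):
    """Return the union of values for the requested categories."""
    result = set()
    for category in wanted:
        result |= by_category.get(category, set())
    return result


if __name__ == "__main__":
    records = ["ads:adspam.com", "malware:getpwnd.net"]
    parsed = parse_prefixed(records)
    print(sorted(select(parsed, ["malware"])))
```

Unknown categories simply yield an empty set, so new categories can be added to the data without breaking existing consumers.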
Personally, I am particularly interested in the filtering of ads, malware, and adult content.
However, it would be advantageous to have the flexibility to accommodate additional categories as needed.
Would you consider implementing this modification?
Additionally, I observed that the domains in the domain directory do not appear to cover the domains listed in the fqdn directory, and vice versa. Could this be unintended behavior?
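One way to quantify this mismatch is to list every FQDN for which neither the name itself nor any of its parent domains appears in the domain list. The sketch below assumes both lists are available as plain strings and uses a naive label-based parent check rather than the Public Suffix List, so it is only an approximation. The function names are hypothetical.

```python
def parent_suffixes(fqdn: str) -> list[str]:
    """Return the name itself and each parent domain, excluding the bare TLD."""
    labels = fqdn.strip().lower().rstrip(".").split(".")
    return [".".join(labels[i:]) for i in range(len(labels) - 1)]


def fqdns_without_listed_domain(fqdns, domains):
    """FQDNs for which no suffix (itself or a parent) is in the domain list."""
    domain_set = {d.strip().lower().rstrip(".") for d in domains}
    return sorted(
        fqdn for fqdn in fqdns
        if not any(suffix in domain_set for suffix in parent_suffixes(fqdn))
    )


if __name__ == "__main__":
    missing = fqdns_without_listed_domain(
        ["www.adspam.com", "cdn.getpwnd.net"],
        ["adspam.com"],
    )
    print(missing)
```

Running the same check in the other direction (domains with no matching FQDN) would show whether the two directories are meant to be independent or whether one is expected to contain the other.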