Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RegEx matches domain names and IP addresses #2463

Closed
xCh12i5 opened this issue Oct 14, 2018 · 5 comments
Closed

RegEx matches domain names and IP addresses #2463

xCh12i5 opened this issue Oct 14, 2018 · 5 comments
Assignees
Labels
Investigating Open issue or bug, under investigation triage: Issue Template Ignored Missing Issue Template

Comments

@xCh12i5
Copy link
Contributor

xCh12i5 commented Oct 14, 2018

sed -nr -e 's/\.{2,}/./g' -e '/\./p' > ${destination}

In the gravity_ParseFileIntoDomains() function the regular expression sed -nr -e 's/\.{2,}/./g' -e '/\./p' matches domain names as well as IP addresses. In case it is intended to match exclusively domain names the regular expression should look like sed -r '/([^\.]+\.)+[^\.]{2,}/!d'.

Proof of concept with the original RegEx:
echo "google.com" | awk -F '#' '{print $1}' | awk -F '/' '{print $1}' | awk '($1 !~ /^#/) { if (NF>1) {print $2} else {print $1}}' | sed -nr -e 's/\.{2,}/./g' -e '/\./p'
google.com
echo "8.8.8.8" | awk -F '#' '{print $1}' | awk -F '/' '{print $1}' | awk '($1 !~ /^#/) { if (NF>1) {print $2} else {print $1}}' | sed -nr -e 's/\.{2,}/./g' -e '/\./p'
8.8.8.8

Proof of concept with the proposed RegEx:
echo "8.8.8.8" | tr -d '\r' | tr '[:upper:]' '[:lower:]' | sed -r '/(\/|#).*$/d' | sed -r 's/^.*\s+//g' | sed -r '/([^\.]+\.)+[^\.]{2,}/!d'
echo "google.com" | tr -d '\r' | tr '[:upper:]' '[:lower:]' | sed -r '/(\/|#).*$/d' | sed -r 's/^.*\s+//g' | sed -r '/([^\.]+\.)+[^\.]{2,}/!d'
google.com

@dschaper dschaper added the triage: Issue Template Ignored Missing Issue Template label Oct 17, 2018
@dschaper
Copy link
Member

Does this manifest as a bug or an issue that needs to be resolved? What are the benefits for implementing the proposed changes?

@dschaper dschaper added the Submitter Attention Required Need action from submitter to continue label Jan 14, 2019
@dschaper dschaper self-assigned this Jan 18, 2019
@dschaper dschaper added Investigating Open issue or bug, under investigation and removed Submitter Attention Required Need action from submitter to continue labels Jan 18, 2019
@dschaper
Copy link
Member

No response. Will implement. This is a better approach to select the domain name from the line in the hosts file instead of splitting and hoping the file is in IP FQDN format. Should be able to strip comments and then parse out the domain name only and remove a lot of the extra steps.

@AzureMarker
Copy link
Contributor

See my comment here about reusing the domain and hostname detection implemented in the API: #2495 (comment)

@dschaper
Copy link
Member

Thanks, I think this sed approach will work better for shell script. Once the API is done and released we can take a look at what shell can be replaced or eliminated.

@xCh12i5 xCh12i5 mentioned this issue Jan 18, 2019
8 tasks
@dschaper
Copy link
Member

Thanks for the PR xCh12i5!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Investigating Open issue or bug, under investigation triage: Issue Template Ignored Missing Issue Template
Projects
None yet
Development

No branches or pull requests

4 participants