Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Elastic changes #222

Merged
merged 9 commits into from Nov 9, 2019

Conversation

@0xdade
Copy link
Member

0xdade commented Nov 6, 2019

A bunch of changes to support improved elasticsearch interfacing. This set of changes should enable #219 to be done more easily as the types of queries we execute are separated out into their own functions. This also uses a base function called execute_raw_query through which all of the elastic interfacing happens. This means that we can have our status check and exception handling in a single function instead of in all of our functions.

Also introduced some changes to routing, via /browse endpoint becoming separate from the /search endpoint. This occurs in this PR because I was making changes to the elastic.py search function and realized that a default argument made more sense than checking if the argument was empty and then setting a default value based on that. But that introduced some other issues with the way queries worked for "browse" versus "search" capability. By separating them, we'll get simpler functions, simpler templates, and move closer towards single-purpose routes.

It also includes a fix for #221 and many similar failures that were related to it.

@ajacques Let me know if this set of changes to the elastic class makes it easier to implement the performance tracing stuff we talked about. I tried to follow your design pattern a bit, but also took a couple liberties in what I think are improvements (i.e. not doing tracing at the raw query level but rather at the execute_<query type> level, so that you can augment the data according to the type of results we expect to get from that type of query.

0xdade added 3 commits Nov 6, 2019
…he future

There is still work left to be done, I've only really implemented search and count right now. We also need to implement delete_by_query, index, and probably a few others. This commit also introduces a dedicated "browse" endpoint that is separate from the search endpoint. While similar in function, the search endpoint was getting unruly to manage both basic browse plus search. This allows us to simplify a bit both in the code as well as in the templates. There are definitely some abstractions left to do to simplify this further, though. Such as url generation for pagination, which is relatively similar all over the place and can be abstracted a little bit.
@0xdade 0xdade requested a review from ajacques Nov 6, 2019
…tracing.

It also does a number of renames to make things more consistently named.
natlas-server/app/elastic.py Outdated Show resolved Hide resolved
natlas-server/app/elastic.py Outdated Show resolved Hide resolved
natlas-server/app/elastic.py Outdated Show resolved Hide resolved
natlas-server/app/elastic.py Outdated Show resolved Hide resolved
natlas-server/app/elastic.py Outdated Show resolved Hide resolved
@0xdade 0xdade force-pushed the 0xdade:elastic_changes branch from d8c7ce2 to e77f74e Nov 6, 2019
natlas-server/app/elastic.py Outdated Show resolved Hide resolved
natlas-server/app/elastic.py Outdated Show resolved Hide resolved
natlas-server/app/elastic.py Outdated Show resolved Hide resolved
natlas-server/app/elastic.py Outdated Show resolved Hide resolved
natlas-server/app/elastic.py Outdated Show resolved Hide resolved
0xdade added 2 commits Nov 6, 2019
…uses.

Also removes ELASTICSEARCH_URL from the database configuration because it doesn't make sense to change the elastic node(s) you're point to while you're already running. In almost every case, it's going to be set once and rarely changed.
self.client.execute_index(index='nmap', id=ip, body=host)
return True

def get_host(self, ip):

This comment has been minimized.

Copy link
@codeclimate

codeclimate bot Nov 6, 2019

Similar blocks of code found in 2 locations. Consider refactoring.

result = self.client.get_collection(index="nmap_history", body=searchBody, _source=source_fields, track_scores=False)
return result

def get_host_by_scan_id(self, scan_id):

This comment has been minimized.

Copy link
@codeclimate

codeclimate bot Nov 6, 2019

Similar blocks of code found in 2 locations. Consider refactoring.

natlas-server/app/elastic/client.py Show resolved Hide resolved
natlas-server/app/elastic/client.py Show resolved Hide resolved
natlas-server/app/elastic/client.py Show resolved Hide resolved
0xdade added 2 commits Nov 7, 2019
Fixes #228. This was an oversight and should be closed. Piggybacking into this PR to avoid merge conflicts.
natlas-server/app/elastic/interface.py Show resolved Hide resolved
def delete_scan(self, scan_id):
''' Delete a specific scan, if it's the most recent then try to migrate the next oldest back into the nmap index '''
migrate = False
searchBody = {

This comment has been minimized.

Copy link
@codeclimate

codeclimate bot Nov 7, 2019

Similar blocks of code found in 2 locations. Consider refactoring.

# we're deleting the most recent scan result and need to pull the next most recent into the nmap index
# otherwise you won't find the host when doing searches or browsing
ipaddr = host['ip']
secondBody = {

This comment has been minimized.

Copy link
@codeclimate

codeclimate bot Nov 7, 2019

Similar blocks of code found in 2 locations. Consider refactoring.

searchBody = {
"size": limit,
"from": offset,
"query": {

This comment has been minimized.

Copy link
@codeclimate

codeclimate bot Nov 7, 2019

Similar blocks of code found in 2 locations. Consider refactoring.

"size": 1,
"query": {
"function_score": {
"query": {

This comment has been minimized.

Copy link
@codeclimate

codeclimate bot Nov 7, 2019

Similar blocks of code found in 2 locations. Consider refactoring.

natlas-server/app/admin/routes.py Show resolved Hide resolved
natlas-server/README.md Outdated Show resolved Hide resolved
except Exception:
self.status = False
Comment on lines +27 to +28

This comment has been minimized.

Copy link
@ajacques

ajacques Nov 8, 2019

Member

One issue with this is that it doesn't bubble up during startup, nor get reported anyway. It just silently gets ignored.

Can we

  1. Bubble this up and fail startup?
  2. Log to console/sentry?

Option 1 would end in a boot loop in Kubernetes/Docker. What happens if the SQL DB is in-op? Does it fail to startup too?

This comment has been minimized.

Copy link
@0xdade

0xdade Nov 8, 2019

Author Member

I think this is a good question that looks at the level of dependency we want in the flask app itself. Without elastic, there's no where to put the data so there's not much value in running the flask app. Without SQL, there's no where to store the configuration or the scoping, so there's also not much value in running the flask app. I think it's fair to make them both hard failures at startup as natlas can't really work as intended without each data store. Alternatively we can handle them in a more automatic resilient way and have SQL DB failures also report a 503 that way if either data store is down then you get a 503 and we can forward the exceptions to sentry without taking down the application itself.

natlas-server/app/elastic/client.py Outdated Show resolved Hide resolved
natlas-server/app/elastic/client.py Outdated Show resolved Hide resolved
natlas-server/app/elastic/interface.py Outdated Show resolved Hide resolved
natlas-server/app/elastic/interface.py Outdated Show resolved Hide resolved
}
}
}
result = self.client.execute_delete_by_query(index="nmap,nmap_history", body=deleteBody)

This comment has been minimized.

Copy link
@ajacques

ajacques Nov 8, 2019

Member

Avoid delete_by_query when you already know a document query. delete_by_queries require a full query and then delete, but delete by doc id can directly find the document and delete it. Additionally, you have an additional level safety that you don't accidentally delete more than one record.

@codeclimate

This comment has been minimized.

Copy link

codeclimate bot commented Nov 8, 2019

Code Climate has analyzed commit 4c8e43d and detected 10 issues on this pull request.

Here's the issue category breakdown:

Category Count
Complexity 1
Duplication 9

View more on Code Climate.

@ajacques ajacques merged commit 10532e5 into natlas:main Nov 9, 2019
6 checks passed
6 checks passed
LGTM analysis: JavaScript No new or fixed alerts
Details
LGTM analysis: Python No new or fixed alerts
Details
ci/circleci: general-analysis Your tests passed on CircleCI!
Details
ci/dockercloud (/natlas-agent) Your tests passed in Docker Cloud
Details
ci/dockercloud (/natlas-server) Your tests passed in Docker Cloud
Details
codeclimate Approved by dade.
Details
@0xdade 0xdade deleted the 0xdade:elastic_changes branch Nov 11, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants
You can’t perform that action at this time.