Initialization of ES

In [1]:
from elasticsearch import Elasticsearch

client = Elasticsearch(['elasticsearch'])
indice = "syslog-*"

1. Query All 

In [2]:
query = {
    "query": {
        "match_all": {}
    }
}

res = client.search(index=indice, body=query, scroll='100m', size=10000)

print("Got %d Hits:" % res['hits']['total']['value'])
    
sid = res['_scroll_id']
scroll_size = len(res['hits']['hits'])

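while scroll_size > 0:
    # Report the size of the current page, then fetch the next one
    print(scroll_size)
    data = client.scroll(scroll_id=sid, scroll='2m')
    # Update the scroll ID
    sid = data['_scroll_id']
    # Get the number of results that returned in the last scroll
    scroll_size = len(data['hits']['hits'])

# Release the server-side scroll context once every page has been read
client.clear_scroll(scroll_id=sid)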

Got 35119 Hits:
10000
10000
10000
5119
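
The scroll loop above can be wrapped in a generator so that callers iterate over hits directly and the scroll context is always released. This is a minimal sketch, not part of the original notebook: `iter_scroll_hits` and `FakeClient` are hypothetical names, and `FakeClient` is an in-memory stand-in that only mimics the `search`, `scroll` and `clear_scroll` calls used here so the example runs without a cluster.

```python
def iter_scroll_hits(client, index, query, page_size=1000, keep_alive="2m"):
    """Yield every hit for `query`, one scroll page at a time."""
    res = client.search(index=index, body=query, scroll=keep_alive, size=page_size)
    sid = res["_scroll_id"]
    try:
        while res["hits"]["hits"]:
            yield from res["hits"]["hits"]
            res = client.scroll(scroll_id=sid, scroll=keep_alive)
            sid = res["_scroll_id"]
    finally:
        # Release the scroll context even if the caller stops iterating early.
        client.clear_scroll(scroll_id=sid)


class FakeClient:
    """Hypothetical in-memory stand-in for the Elasticsearch client (demo only)."""

    def __init__(self, docs):
        self.docs, self.pos, self.size, self.cleared = docs, 0, 0, False

    def _page(self):
        page = self.docs[self.pos:self.pos + self.size]
        self.pos += len(page)
        return {"_scroll_id": "fake-scroll-id", "hits": {"hits": page}}

    def search(self, index, body, scroll, size):
        self.pos, self.size = 0, size
        return self._page()

    def scroll(self, scroll_id, scroll):
        return self._page()

    def clear_scroll(self, scroll_id):
        self.cleared = True


fake = FakeClient([{"_source": {"n": i}} for i in range(25)])
hits = list(iter_scroll_hits(fake, "syslog-*", {"query": {"match_all": {}}}, page_size=10))
print(len(hits))
```

With a real cluster, the `client` created in the first cell would be passed instead of `fake`. The client library also ships `elasticsearch.helpers.scan`, which wraps the same scroll pattern.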


2. Match Query

In [3]:
query = {
    "query": {
        "match": {
            "hostname":"for.org"
        }
    }
}

res = client.search(index=indice, body=query)

print("Got %d Hits:" % res['hits']['total']['value'])
for hit in res['hits']['hits']:
    print("%(timestamp)s %(raw)s" % hit["_source"])

Got 1108 Hits:
2021-07-21T08:45:24.811Z <23>1 2021-07-21T08:45:24.811Z for.org shaneIxD 3092 ID491 - There's a breach in the warp core, captain
2021-07-21T08:45:24.981Z <115>1 2021-07-21T08:45:24.981Z for.org ahmadajmi 7604 ID700 - A bug was encountered but not in Vector, which doesn't have bugs
2021-07-21T08:45:25.011Z <178>1 2021-07-21T08:45:25.011Z for.org Karimmove 4403 ID240 - We're gonna need a bigger boat
2021-07-21T08:45:25.091Z <18>1 2021-07-21T08:45:25.091Z for.org ahmadajmi 8081 ID449 - Pretty pretty pretty good
2021-07-21T08:45:25.121Z <91>2 2021-07-21T08:45:25.121Z for.org devankoshal 5769 ID103 - A bug was encountered but not in Vector, which doesn't have bugs
2021-07-21T08:45:25.281Z <22>2 2021-07-21T08:45:25.281Z for.org devankoshal 4576 ID870 - #hugops to everyone who has to deal with this
2021-07-21T08:45:25.511Z <75>1 2021-07-21T08:45:25.511Z for.org jesseddy 3114 ID35 - Great Scott! We're never gonna reach 88 mph with the flux capacitor in its current state!
2021-07

3. Multi Match

In [4]:
query = {
    "query": {
        "multi_match": {
            "query": "up.com ahmadajmi", 
            "fields":["hostname", "application"]
        }
    }
}

res = client.search(index=indice, body=query)

print("Got %d Hits:" % res['hits']['total']['value'])
for hit in res['hits']['hits']:
    print("%(timestamp)s %(raw)s" % hit["_source"])

Got 6768 Hits:
2021-07-21T08:45:25.320Z <143>2 2021-07-21T08:45:25.320Z up.com meln1ks 5366 ID85 - Take a breath, let it go, walk away
2021-07-21T08:45:25.371Z <77>2 2021-07-21T08:45:25.371Z up.com benefritz 5570 ID788 - Maybe we just shouldn't use computers
2021-07-21T08:45:25.811Z <70>2 2021-07-21T08:45:25.811Z up.com jesseddy 2857 ID178 - Great Scott! We're never gonna reach 88 mph with the flux capacitor in its current state!
2021-07-21T08:45:26.401Z <145>2 2021-07-21T08:45:26.401Z up.com ahmadajmi 6473 ID51 - Maybe we just shouldn't use computers
2021-07-21T08:45:26.531Z <41>1 2021-07-21T08:45:26.531Z up.com devankoshal 5170 ID924 - #hugops to everyone who has to deal with this
2021-07-21T08:45:26.641Z <6>1 2021-07-21T08:45:26.641Z up.com benefritz 8272 ID965 - You're not gonna believe what just happened
2021-07-21T08:45:26.780Z <69>2 2021-07-21T08:45:26.780Z up.com jesseddy 2141 ID294 - We're gonna need a bigger boat
2021-07-21T08:45:27.141Z <60>1 2021-07-21T08:45:27.141Z up.com 

4. Query String Query

In [None]:
query = {
  "query": {
    "query_string": {
      "query": "(for.org) AND (pretty breath) "
    }
  }
}

res = client.search(index=indice, body=query)

print("Got %d Hits:" % res['hits']['total']['value'])
for hit in res['hits']['hits']:
    print("%(timestamp)s %(raw)s" % hit["_source"])

5. Term Query

In [5]:
query = {
   "query":{
      "term":{"message":"pretty"}
   }
}

res = client.search(index=indice, body=query)

print("Got %d Hits:" % res['hits']['total']['value'])
for hit in res['hits']['hits']:
    print("%(timestamp)s %(raw)s" % hit["_source"])

Got 4807 Hits:
2021-07-21T08:45:24.591Z <178>1 2021-07-21T08:45:24.591Z we.com jesseddy 5994 ID544 - Pretty pretty pretty good
2021-07-21T08:45:24.711Z <14>1 2021-07-21T08:45:24.711Z some.org Karimmove 8358 ID568 - Pretty pretty pretty good
2021-07-21T08:45:24.851Z <40>1 2021-07-21T08:45:24.851Z random.us devankoshal 1322 ID451 - Pretty pretty pretty good
2021-07-21T08:45:24.891Z <68>1 2021-07-21T08:45:24.891Z up.us devankoshal 7122 ID704 - Pretty pretty pretty good
2021-07-21T08:45:24.921Z <101>1 2021-07-21T08:45:24.921Z we.com benefritz 3822 ID947 - Pretty pretty pretty good
2021-07-21T08:45:25.091Z <18>1 2021-07-21T08:45:25.091Z for.org ahmadajmi 8081 ID449 - Pretty pretty pretty good
2021-07-21T08:45:25.191Z <156>1 2021-07-21T08:45:25.191Z we.org ahmadajmi 4916 ID95 - Pretty pretty pretty good
2021-07-21T08:45:25.241Z <159>1 2021-07-21T08:45:25.241Z make.com benefritz 4619 ID592 - Pretty pretty pretty good
2021-07-21T08:45:25.251Z <63>1 2021-07-21T08:45:25.251Z make.de shaneIxD 858

6. Range Query

In [6]:
query = {
   "query":{
      "range":{
         "version":{
            "gte":2
         }
      }
   }
}

res = client.search(index=indice, body=query)

print("Got %d Hits:" % res['hits']['total']['value'])
for hit in res['hits']['hits']:
    print("%(timestamp)s %(raw)s" % hit["_source"])

Got 10000 Hits:
2021-07-21T08:45:24.601Z <128>2 2021-07-21T08:45:24.601Z we.org benefritz 3125 ID149 - We're gonna need a bigger boat
2021-07-21T08:45:24.611Z <143>2 2021-07-21T08:45:24.611Z up.org ahmadajmi 8924 ID384 - Take a breath, let it go, walk away
2021-07-21T08:45:24.631Z <6>2 2021-07-21T08:45:24.631Z for.com Karimmove 9308 ID196 - You're not gonna believe what just happened
2021-07-21T08:45:24.651Z <143>2 2021-07-21T08:45:24.651Z we.net jesseddy 310 ID814 - There's a breach in the warp core, captain
2021-07-21T08:45:24.661Z <93>2 2021-07-21T08:45:24.661Z make.org shaneIxD 7673 ID148 - A bug was encountered but not in Vector, which doesn't have bugs
2021-07-21T08:45:24.672Z <58>2 2021-07-21T08:45:24.672Z for.com ahmadajmi 669 ID828 - There's a breach in the warp core, captain
2021-07-21T08:45:24.681Z <160>2 2021-07-21T08:45:24.681Z names.org benefritz 969 ID513 - Great Scott! We're never gonna reach 88 mph with the flux capacitor in its current state!
2021-07-21T08:45:24.701Z 
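
A reported total of exactly 10000 is usually a lower bound rather than an exact count: by default Elasticsearch 7 stops counting at 10000 and marks the total with `"relation": "gte"`. The sketch below shows one way to make the printed count reflect that. `describe_total` and `format_hits` are hypothetical helpers, and `sample` is a hand-built response fragment that reuses one line from the output above.

```python
def describe_total(res):
    """Return a count label that says 'at least' when the total is a lower bound."""
    total = res["hits"]["total"]
    prefix = "at least " if total.get("relation") == "gte" else ""
    return "Got %s%d hits" % (prefix, total["value"])


def format_hits(res, limit=5):
    """Return the count label followed by up to `limit` formatted hits."""
    lines = [describe_total(res)]
    for hit in res["hits"]["hits"][:limit]:
        source = hit["_source"]
        # Fall back to "?" when a document lacks one of the two fields.
        lines.append("%s %s" % (source.get("timestamp", "?"), source.get("raw", "?")))
    return lines


# Hand-built fragment shaped like a search response (not real cluster output).
sample = {
    "hits": {
        "total": {"value": 10000, "relation": "gte"},
        "hits": [
            {"_source": {
                "timestamp": "2021-07-21T08:45:24.601Z",
                "raw": "<128>2 2021-07-21T08:45:24.601Z we.org benefritz 3125 "
                       "ID149 - We're gonna need a bigger boat",
            }}
        ],
    }
}
lines = format_hits(sample)
print("\n".join(lines))
```

If an exact count is needed, `"track_total_hits": True` can be added to the request body, at the cost of counting every matching document.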

7. Exists Query

In [7]:
query = {
  "query": {
    "exists": {
      "field": "application"
    }
  }
}

res = client.search(index=indice, body=query)

print("Got %d Hits:" % res['hits']['total']['value'])
for hit in res['hits']['hits']:
    print("%(timestamp)s %(raw)s" % hit["_source"])

Got 10000 Hits:
2021-07-21T08:45:24.581Z <159>1 2021-07-21T08:45:24.581Z for.us meln1ks 9602 ID642 - There's a breach in the warp core, captain
2021-07-21T08:45:24.591Z <178>1 2021-07-21T08:45:24.591Z we.com jesseddy 5994 ID544 - Pretty pretty pretty good
2021-07-21T08:45:24.601Z <128>2 2021-07-21T08:45:24.601Z we.org benefritz 3125 ID149 - We're gonna need a bigger boat
2021-07-21T08:45:24.611Z <143>2 2021-07-21T08:45:24.611Z up.org ahmadajmi 8924 ID384 - Take a breath, let it go, walk away
2021-07-21T08:45:24.621Z <48>1 2021-07-21T08:45:24.621Z up.de ahmadajmi 6519 ID934 - A bug was encountered but not in Vector, which doesn't have bugs
2021-07-21T08:45:24.631Z <6>2 2021-07-21T08:45:24.631Z for.com Karimmove 9308 ID196 - You're not gonna believe what just happened
2021-07-21T08:45:24.641Z <67>1 2021-07-21T08:45:24.641Z names.com meln1ks 1245 ID984 - We're gonna need a bigger boat
2021-07-21T08:45:24.651Z <143>2 2021-07-21T08:45:24.651Z we.net jesseddy 310 ID814 - There's a breach in 

8. Regexp Query
https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-regexp-query.html

In [2]:
query = {
  "query": {
    "regexp": {
      "hostname": {
        "value": "up.*",
        "flags": "ALL",
        "max_determinized_states": 10000,
        "rewrite": "constant_score"
      }
    }
  }
}

res = client.search(index=indice, body=query)

print("Got %d Hits:" % res['hits']['total']['value'])
for hit in res['hits']['hits']:
    print("%(timestamp)s %(raw)s" % hit["_source"])

Got 10000 Hits:
2021-02-24T23:44:56.200Z <29>1 2021-02-24T23:44:56.200Z up.org shaneIxD 2747 ID774 - There's a breach in the warp core, captain
2021-02-24T23:44:56.250Z <43>1 2021-02-24T23:44:56.250Z up.org devankoshal 3940 ID420 - Pretty pretty pretty good
2021-02-24T23:45:55.395Z <49>2 2021-02-24T23:45:55.395Z up.de benefritz 8523 ID509 - You're not gonna believe what just happened
2021-02-24T23:45:55.435Z <90>2 2021-02-24T23:45:55.435Z up.us shaneIxD 7480 ID833 - We're gonna need a bigger boat
2021-02-24T23:45:55.695Z <122>1 2021-02-24T23:45:55.695Z up.com Karimmove 2258 ID67 - Take a breath, let it go, walk away
2021-02-24T23:45:55.705Z <162>1 2021-02-24T23:45:55.705Z up.com shaneIxD 9784 ID281 - Pretty pretty pretty good
2021-02-24T23:45:55.715Z <59>2 2021-02-24T23:45:55.715Z up.us devankoshal 5168 ID985 - Maybe we just shouldn't use computers
2021-02-24T23:45:55.825Z <135>1 2021-02-24T23:45:55.825Z up.us meln1ks 3570 ID610 - Pretty pretty pretty good
2021-02-24T23:45:55.906Z <104
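
In the pattern `up.*`, the `.` matches any character, so the query matches every hostname term that starts with `up`, not only those that start with `up.`. Lucene patterns are anchored to the whole term, so no `^` or `$` is needed. The sketch below escapes reserved characters when a literal prefix is wanted. `escape_regexp` and `prefix_regexp_query` are hypothetical helpers, and the reserved set assumes all optional operators are enabled, as with `"flags": "ALL"`.

```python
# Reserved characters in Elasticsearch regexp syntax, including the optional
# operators (# @ & < > ~) that are active when flags is ALL.
RESERVED = set('.?+*|{}[]()"\\#@&<>~')


def escape_regexp(text):
    """Backslash-escape every reserved character so it matches literally."""
    return "".join("\\" + ch if ch in RESERVED else ch for ch in text)


def prefix_regexp_query(field, prefix):
    """Build a regexp query matching terms that start with the literal `prefix`."""
    return {"query": {"regexp": {field: {"value": escape_regexp(prefix) + ".*"}}}}


query = prefix_regexp_query("hostname", "up.")
pattern = query["query"]["regexp"]["hostname"]["value"]
print(pattern)
```

The resulting body can be passed to `client.search(index=indice, body=query)` like the other queries in this notebook.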

9. Compound Query
https://www.elastic.co/guide/en/elasticsearch/reference/current/compound-queries.html

In [8]:
query = {
   "query": {
      "bool" : {
         "must" : {
            "term" : { "hostname" : "random.net" }
         },
         "should": {
            "term" : { "application" : "ahmadajmi" }
         },
         "minimum_should_match" : 1,
         "boost" : 1.0
      }
   }
}


res = client.search(index=indice, body=query)

print("Got %d Hits:" % res['hits']['total']['value'])
for hit in res['hits']['hits']:
    print("%(timestamp)s %(raw)s" % hit["_source"])

Got 191 Hits:
2021-07-21T08:45:29.551Z <61>1 2021-07-21T08:45:29.551Z random.net ahmadajmi 4404 ID651 - Maybe we just shouldn't use computers
2021-07-21T08:45:31.291Z <64>1 2021-07-21T08:45:31.291Z random.net ahmadajmi 912 ID281 - You're not gonna believe what just happened
2021-07-21T08:45:35.461Z <93>2 2021-07-21T08:45:35.461Z random.net ahmadajmi 506 ID579 - Pretty pretty pretty good
2021-07-21T08:45:35.871Z <172>1 2021-07-21T08:45:35.871Z random.net ahmadajmi 9431 ID339 - There's a breach in the warp core, captain
2021-07-21T08:45:36.941Z <172>2 2021-07-21T08:45:36.941Z random.net ahmadajmi 4072 ID499 - A bug was encountered but not in Vector, which doesn't have bugs
2021-07-21T08:45:44.261Z <18>1 2021-07-21T08:45:44.261Z random.net ahmadajmi 6458 ID241 - We're gonna need a bigger boat
2021-07-21T08:45:45.531Z <65>2 2021-07-21T08:45:45.531Z random.net ahmadajmi 9131 ID106 - Great Scott! We're never gonna reach 88 mph with the flux capacitor in its current state!
2021-07-21T08:45:50
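
Because `minimum_should_match` is 1 and there is a single `should` clause, that clause is effectively required, which is why every hit above has both `random.net` and `ahmadajmi`. The sketch below builds the same kind of body from plain dictionaries so the clauses are easier to vary. `bool_terms` is a hypothetical helper, not part of the client library.

```python
def bool_terms(must=None, should=None, minimum_should_match=None):
    """Build a bool query whose clauses are all term queries.

    `must` and `should` map field names to the exact term to look for.
    """
    bool_body = {}
    if must:
        bool_body["must"] = [{"term": {field: value}} for field, value in must.items()]
    if should:
        bool_body["should"] = [{"term": {field: value}} for field, value in should.items()]
    if minimum_should_match is not None:
        # Number of should clauses a document has to satisfy to match.
        bool_body["minimum_should_match"] = minimum_should_match
    return {"query": {"bool": bool_body}}


query = bool_terms(
    must={"hostname": "random.net"},
    should={"application": "ahmadajmi"},
    minimum_should_match=1,
)
print(query)
```

Leaving out `minimum_should_match` when a `must` clause is present makes the `should` clause optional: it then only influences scoring.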

10. Count aggregation

In [9]:
query = {
   "aggs":{
      "version_count":{
         "value_count":{
            "field":"version"
         }
      }
   }
}


res = client.search(index=indice, body=query)

print("Got %d Hits:" % res['hits']['total']['value'])

res['aggregations']

Got 10000 Hits:


{'version_count': {'value': 50458}}
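
The aggregation value (50458) is larger than the printed hit total (10000), most likely because the hit total is capped by default while the aggregation runs over every matching document. When only the aggregation result is of interest, the documents themselves do not need to be returned. The sketch below builds such a request body. `metric_agg_body` is a hypothetical helper; `size` and `track_total_hits` are standard search body parameters.

```python
def metric_agg_body(name, agg_type, field, exact_total=False):
    """Build an aggregation-only request body for a single-field metric."""
    body = {
        "size": 0,  # return no documents, only the aggregation result
        "aggs": {name: {agg_type: {"field": field}}},
    }
    if exact_total:
        # Ask Elasticsearch to count all matches instead of stopping at the default cap.
        body["track_total_hits"] = True
    return body


body = metric_agg_body("version_count", "value_count", "version")
print(body)
```

The body can be sent with `client.search(index=indice, body=body)`, and the same helper covers the cardinality aggregation by passing `"cardinality"` as `agg_type`.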

11. Cardinality aggregation

In [10]:
query = {
  "aggs": {
    "my-agg-name": {
      "cardinality": {
        "field": "priority"
      }
    }
  }
}
    
res = client.search(index=indice, body=query, scroll='100m', size=10000)

print("Got %d Hits:" % res['hits']['total']['value'])

print(res['aggregations'])

Got 51165 Hits:
{'my-agg-name': {'value': 191}}
