
Issue with logstash 2.3 and coordinates returned by geoip filter #4961

Closed
arizonawayfarer opened this issue Mar 31, 2016 · 31 comments

@arizonawayfarer

I've run into an issue using the geoip filter in logstash 2.3 where the plugin is returning valid coordinates after looking up an IP address, but when logstash tries to send the data to elastic the coordinates have been changed to an invalid number. I added some additional logging to the geoip filter so I could see what was getting sent back to logstash.

When this happens, the invalid latitude that logstash tries to send to elastic is the same across all events until I restart logstash. After a restart the latitude changes to a new invalid value.

So this is getting returned to logstash from the geoip filter:

{:level=>:error, "ip"=>"REDACTED", "country_code2"=>"US", "country_code3"=>"USA", "country_name"=>"United States", "continent_code"=>"NA", "region_name"=>"MA", "city_name"=>"Cambridge", "postal_code"=>"02142", "latitude"=>42.362599999999986, "longitude"=>-71.0843, "dma_code"=>506, "area_code"=>617, "timezone"=>"America/New_York", "real_region_name"=>"Massachusetts", "location"=>[-71.0843, 42.362599999999986]}

And then this is what happens when logstash tries to send to elastic:

Failed action.  {:status=>400, :action=>["index", {:_id=>nil, :_index=>"firewall-2016.03.30", :_type=>"syslog", :_routing=>nil}, 2016-03-30T20:24:01.940Z 10.1.0.9 <182>Mar 30 2016 13:24:01: %ASA-6-302013: Built outbound TCP connection 2218122914 for outside:REDACTED/80 (REDACTED/80) to inside:REDACTED/7180 (REDACTED/35875)
], :response=>{"create"=>{"_index"=>"firewall-2016.03.30", "_type"=>"syslog", "_id"=>"AVPJMpNujqG9ypkgPRCU", "status"=>400, "error"=>{"type"=>"mapper_parsing_exception", "reason"=>"failed to parse", "caused_by"=>{"type"=>"illegal_argument_exception", "reason"=>"illegal latitude value [269.82439041137695] for src_ip_coords"}}}}, :level=>:warn}

I can't for the life of me figure out where the invalid latitude is coming from. I've removed actual IP addresses to protect the innocent, but the src_ip field contains a valid public IP address that is successfully filtered by geoip in logstash 2.2.1 :)

The same data and logstash configuration work fine in 2.2.1; this is something new that appeared with 2.3.

@ph
Contributor

ph commented Mar 31, 2016

illegal latitude value [269.82439041137695]

This means the value is out of bounds for what ES expects for a geo_point: latitude must be between -90 and 90, and longitude between -180 and 180. So I think geoip is returning a wrong value for one of the IPs, but I would expect that IP to behave the same way in LS 2.2.1 and LS 2.3.

I have a few questions:

  1. Are you using the default database for geoip?
  2. Are you using the same input source for both 2.2.1 and 2.3.0?
  3. The two traces that you provide don't have the same location: the first one has a valid location, [-71.0843, 42.362599999999986], but the ES response clearly shows an invalid geopoint, illegal latitude value [269.82439041137695].
  4. Can you share the IP? I think it might be a problem with the geoip data itself, since this data was updated with 2.3.1.
  5. Can you share your configuration?

I think this is a rogue IP and geoip returns a bad result for it.

@arizonawayfarer
Author

I am not using the default database for geoip; I'm using the most recent one provided by MaxMind. I want to remind you that this database works perfectly fine on 2.2.1.

You are correct that the traces I provided show two different locations - that's the problem. The geoip filter is returning a valid latitude and longitude to logstash, and when logstash tries to send the location to elastic it has been changed to an invalid value.

The IP is irrelevant, I promise. The same data works fine in 2.2.1. I'll spare you my entire config:

      grok
      {
        break_on_match => true
        match => [
          "message", "%{CISCOFW106001}",
          "message", "%{CISCOFW106006_106007_106010}",
          "message", "%{CISCOFW106014}",
          "message", "%{CISCOFW106015}",
          "message", "%{CISCOFW106021}",
          "message", "%{CISCOFW106023}",
          "message", "%{CISCOFW106100}",
          "message", "%{CISCOFW110002}",
          "message", "%{CISCOFW302010}",
          "message", "%{CISCOFW302013_302014_302015_302016}",
          "message", "%{CISCOFW302020_302021}",
          "message", "%{CISCOFW305011}",
          "message", "%{CISCOFW313001_313004_313008}",
          "message", "%{CISCOFW313005}",
          "message", "%{CISCOFW402117}",
          "message", "%{CISCOFW402119}",
          "message", "%{CISCOFW419001}",
          "message", "%{CISCOFW419002}",
          "message", "%{CISCOFW500004}",
          "message", "%{CISCOFW602303_602304}",
          "message", "%{CISCOFW710001_710002_710003_710005_710006}",
          "message", "%{CISCOFW713172}",
          "message", "%{CISCOFW733100}"
        ]
      }
    }

      if "" in [src_ip]
      {
            geoip {
                source => "src_ip"
                target => "src_ip_geoip"
                database => "C:\logstash\maxmind\GeoLiteCity.dat"
                add_field => ["src_ip_coords", "%{[src_ip_geoip][longitude]}"]
                add_field => ["src_ip_coords", "%{[src_ip_geoip][latitude]}"]
            }      
      } 

Here's an example using 208.67.222.222:


{:level=>:error, "ip"=>"208.67.222.222", "country_code2"=>"US", "country_code3"=>"USA", "country_name"=>"United States", "continent_code"=>"NA", "region_name"=>"CA", "city_name"=>"San Francisco", "postal_code"=>"94107", "latitude"=>37.7697, "longitude"=>-122.39330000000001, "dma_code"=>807, "area_code"=>415, "timezone"=>"America/Los_Angeles", "real_region_name"=>"California", "location"=>[-122.39330000000001, 37.7697]}

Failed action.  {:status=>400, :action=>["index", {:_id=>nil, :_index=>"firewall-2016.03.31", :_type=>"syslog", :_routing=>nil}, 2016-03-31T22:36:59.893Z 10.1.0.9 <182>Mar 31 2016 15:36:59: %ASA-6-302016: Teardown UDP connection 2233065380 for outside:208.67.222.222/53 to inside:10.1.2.102/56490 duration 0:00:00 bytes 300
], :response=>{"create"=>{"_index"=>"firewall-2016.03.31", "_type"=>"syslog", "_id"=>"AVPO0qzKjqG9ypkgk8p9", "status"=>400, "error"=>{"type"=>"mapper_parsing_exception", "reason"=>"failed to parse", "caused_by"=>{"type"=>"illegal_argument_exception", "reason"=>"illegal latitude value [269.96450901031494] for src_ip_coords"}}}}, :level=>:warn}

In this example, 208.67.222.222 is the source IP address. The geoip filter returned valid coordinates, but when logstash tried to send them to elastic the coordinates seem to have been changed.

The exact same data, MaxMind database, and logstash config work fine in 2.2.1.

@ph
Contributor

ph commented Apr 1, 2016

I did a bit more investigation with this config: https://gist.github.com/dbb64f216f010c5dbef09a1419b720dc

The rubydebug output of the event is a bit different between the two versions: in one the coordinates appear to be strings, and in the other floats.

~/e/s/logstash-2.2.2 ❯❯❯ echo "208.67.222.222" | bin/logstash -f ../geoip.conf
Settings: Default pipeline workers: 4
Logstash startup completed
{
          "message" => "208.67.222.222",
         "@version" => "1",
       "@timestamp" => "2016-04-01T00:49:27.488Z",
             "host" => "sashimi",
     "src_ip_geoip" => {
                      "ip" => "208.67.222.222",
           "country_code2" => "US",
           "country_code3" => "USA",
            "country_name" => "United States",
          "continent_code" => "NA",
             "region_name" => "CA",
               "city_name" => "San Francisco",
             "postal_code" => "94107",
                "latitude" => 37.7697,
               "longitude" => -122.39330000000001,
                "dma_code" => 807,
               "area_code" => 415,
                "timezone" => "America/Los_Angeles",
        "real_region_name" => "California",
                "location" => [
            [0] -122.39330000000001,
            [1] 37.7697
        ]
    },
    "src_ip_coords" => [
        [0] -122.39330000000001,
        [1] 37.7697
    ]
}
Logstash shutdown completed
~/e/s/logstash-2.3.0 ❯❯❯ echo "208.67.222.222" | bin/logstash -f ../geoip.conf
Settings: Default pipeline workers: 4
Pipeline main started
[1] pry(#<LogStash::Filters::Ruby>)>
{
       "@timestamp" => 2016-04-01T00:51:09.344Z,
    "src_ip_coords" => [
        [0] "-122.39330000000001",
        [1] "37.7697"
    ],
     "src_ip_geoip" => {
                "timezone" => "America/Los_Angeles",
                      "ip" => "208.67.222.222",
                "latitude" => 37.7697,
               "area_code" => 415,
          "continent_code" => "NA",
               "city_name" => "San Francisco",
           "country_code2" => "US",
            "country_name" => "United States",
                "dma_code" => 807,
           "country_code3" => "USA",
             "region_name" => "CA",
                "location" => [
            [0] -122.39330000000001,
            [1] 37.7697
        ],
        "real_region_name" => "California",
             "postal_code" => "94107",
               "longitude" => -122.39330000000001
    },
         "@version" => "1",
             "host" => "sashimi",
          "message" => "208.67.222.222"
}
Pipeline main has been shutdown
stopping pipeline {:id=>"main"}

@ph
Contributor

ph commented Apr 1, 2016

I had no problem serializing the data with the provided IP. Can you try my configuration with your data? Is there any other place in your configuration that mutates the src_ip_coords field?

@qk4l

qk4l commented Apr 1, 2016

I'm not sure if it's the same problem, but the geoip filter is also broken after upgrading to 2.3.
Error:

{:timestamp=>"2016-04-01T13:28:53.906000+0300", :message=>"Failed action. ", :status=>400, :action=>["index", {:_id=>nil, :_index=>"logstash-vpnssl-2016.04.01", :_type=>"vpnssl", :_routing=>nil}, 2016-04-01T10:28:53.194Z vpnssl.example.com <134> 2016-04-01T13:26:23+04:00  PulseSecure: localtime=2016-04-01 13:26:23 vpnnode=ive src=188.134.25.207 user=USERNAME realm="AD" roles="ROLE_Example" id=NWC23508 userAgent= msg="Key Exchange number 2 occurred for user with NCIP 192.168.X.X "], :response=>{"create"=>{"_index"=>"logstash-vpnssl-2016.04.01", "_type"=>"vpnssl", "_id"=>"AVPRXlXjToXUiHrxD7C4", "status"=>400, "error"=>{"type"=>"mapper_parsing_exception", "reason"=>"failed to parse [src_ip_geoip.continent_code]", "caused_by"=>{"type"=>"number_format_exception", "reason"=>"For input string: \"EU\""}}}}, :level=>:warn}

geoip filter

        if [src_ip] {
            geoip {
              source => "src_ip"
              target => "geoip"
              database => "/etc/logstash/GeoLiteCity.dat"
              add_field => [ "[geoip][coordinates]", "%{[geoip][longitude]}" ]
              add_field => [ "[geoip][coordinates]", "%{[geoip][latitude]}"  ]
            }
        }

Test Result

echo "188.134.25.207" | /opt/logstash/bin/logstash -f test.conf 
Settings: Default pipeline workers: 8
Pipeline main started
{
       "message" => "188.134.25.207",
    "@timestamp" => 2016-04-01T10:37:35.534Z,
          "host" => "elk",
         "geoip" => {
        "continent_code" => "EU",
              "location" => [
            [0] 37.616600000000005,
            [1] 55.75
        ],
         "country_code2" => "RU",
         "country_code3" => "RUS",
             "longitude" => 37.616600000000005,
              "latitude" => 55.75,
           "coordinates" => [
            [0] "37.616600000000005",
            [1] "55.75"
        ],
          "country_name" => "Russian Federation",
                    "ip" => "188.134.25.207"
    },
      "@version" => "1"
}

@purbon
Contributor

purbon commented Apr 1, 2016

@qk4l I'm sorry for your issues here. Are you able to test this against a fresh ES instance? My guess would be a wrong mapping, as I understand from:

"caused_by"=>{"type"=>"number_format_exception", "reason"=>"For input string: \"EU\""}

@purbon
Contributor

purbon commented Apr 1, 2016

@arizonawayfarer I also did a few tests with your issue and was unable to reproduce the error. My current idea is that something in your config might have a bug and is actually mutating the input values. In order to provide a better answer, it would be super nice if you could share your full configuration; that way we can rule out other issues and narrow the problem down.

Thanks a lot for your time and contribution.

@ph ph added the unconfirmed label Apr 1, 2016
@arizonawayfarer
Author

input {
 rabbitmq {
    host => "mqhost"
    password => "secret"
    user => "logstash"
    vhost => "logstash"
    queue => "logstash.rawlogs"
    exchange => "logstash.unsecure"
    durable => true
    arguments => {
       "x-queue-master-locator" => "random"
       "x-message-ttl" => 3600000
    }
  }   
}

filter {

  if [type] == "rsa_log"
  {
    mutate {
      remove_tag => [ "unfilteredbeats" ]
    }   
  }

  if [type] == "syslog"
  {
    if [host] == "host1" or [host] == "host2" {
      mutate {
        remove_tag => [ "unfilteredsyslog" ]
        add_tag => [ "sophoswebproxy" ]
      }
    }

    if [host] == "host1" or [host] == "host2" or [host] == "host3" or [host] == "host4" or [host] == "host5" or [host] == "host6" or [host] == "host7" or [host] == "host8" 
    {
      mutate {
        remove_tag => [ "unfilteredsyslog", "_grokparsefailure_sysloginput"]
        add_tag => [ "asa_firewall" ]
      }

      grok
      {
        break_on_match => true
        match => [
          "message", "%{CISCOFW106001}",
          "message", "%{CISCOFW106006_106007_106010}",
          "message", "%{CISCOFW106014}",
          "message", "%{CISCOFW106015}",
          "message", "%{CISCOFW106021}",
          "message", "%{CISCOFW106023}",
          "message", "%{CISCOFW106100}",
          "message", "%{CISCOFW110002}",
          "message", "%{CISCOFW302010}",
          "message", "%{CISCOFW302013_302014_302015_302016}",
          "message", "%{CISCOFW302020_302021}",
          "message", "%{CISCOFW305011}",
          "message", "%{CISCOFW313001_313004_313008}",
          "message", "%{CISCOFW313005}",
          "message", "%{CISCOFW402117}",
          "message", "%{CISCOFW402119}",
          "message", "%{CISCOFW419001}",
          "message", "%{CISCOFW419002}",
          "message", "%{CISCOFW500004}",
          "message", "%{CISCOFW602303_602304}",
          "message", "%{CISCOFW710001_710002_710003_710005_710006}",
          "message", "%{CISCOFW713172}",
          "message", "%{CISCOFW733100}"
        ]
      }
    }

    if [type] == "snmptrap"
    {
            de_dot {
                nested => true
            }
            mutate {
                remove_field => [ "CISCO-CIDS-MIB::cidsGeneralLocalTime", "CISCO-CIDS-MIB::cidsGeneralUTCTime" ]
            }
    }

    if [host] == "host1" or [host] == "host2"
    {
      mutate {
        remove_tag => [ "unfilteredsyslog" ]
        add_tag => [ "sophosemailappliance" ]
      }    
    }

    if "sophoswebproxy" in [tags] {
      grok {
            break_on_match => false
            match => {
                'message' => '(sxl=%{DATA:sxl} h=%{DATA:h} )?u="%{DATA:username}" s=%{DATA:statuscode} X=%{DATA:x} t=%{DATA:unixtime} T=%{DATA:T} Ts=%{DATA:Ts} act=%{DATA:actioncode} cat="%{DATA:uricategory}" app="%{DATA:app}" rsn=%{DATA:rsn} threat="%{DATA:detectedthreat}" type="%{DATA:mimetype}" ctype="%{DATA:ctype}" sav-ev=%{DATA:savev} sav-dv=%{DATA:savdv} uri-dv=%{DATA:uridv} cache=%{DATA:cache} in=%{DATA:in} out=%{DATA:out} meth=%{DATA:httpmethod} ref="%{DATA:referralurl}" ua="%{DATA:useragent}" req="%{DATA:request}" dom="%{DATA:domain}" filetype="%{DATA:filetype}" rule="%{DATA:rule}" filesize=%{DATA:filesize} axtime=%{DATA:axtime} fttime=%{DATA:fttime} scantime=%{DATA:scantime} src_cat="%{DATA:srccat}" labs_cat="%{DATA:labscat}" dcat_prox="%{DATA:dcatprox}" target_ip="%{DATA:targetip}" labs_rule_id="%{DATA:labsruleid}" reqtime=%{DATA:reqtime} adtime=%{DATA:adtime} ftbypass=%{DATA:ftbypass} os=%{DATA:os} authn=%{DATA:authn} auth_by=%{DATA:authby} dnstime=%{DATA:dnstime} quotatime=%{DATA:quotatime} sandbox=%{DATA:sandbox}'
            }
        }
        date {
            locale => "en"
            timezone => "UTC"
            match => ["timestamp", "MMM dd HH:mm:ss", "MMM  dd HH:mm:ss"]
        }
        if [targetip] != "-" {
            geoip {
                source => "targetip"
                target => "geoip"
                database => "C:\logstash\maxmind\GeoLiteCity.dat"
                add_field => ["coords", "%{[geoip][longitude]}"]
                add_field => ["coords", "%{[geoip][latitude]}"]
            }
        }  
    }      
  }
  else if [type] == "wineventlog"
  {

    if [event_id] == 5136 and "groupPolicyContainer" in [message]
    {
      mutate {
            add_tag => [ "groupPolicyChange"  ]
      }  
    }

    if [event_id] == 4624
    {
      grok {
        break_on_match => true
        match => {
          'message' => '%{DATA:event_message}\n\nSubject:\n\tSecurity ID:\t\t%{DATA:security_id}\n\tAccount Name:\t\t%{DATA:subject_accountname}\n\tAccount Domain:\t\t%{DATA:subject_accountdomain}\n\tLogon ID:\t\t%{DATA:subject_logonid}\n\nLogon Type:\t\t\t%{INT:logon_type}\n\n(Impersonation Level:\t\t%{WORD:impersonation_level}\n\n)?New Logon:\n\tSecurity ID:\t\t%{DATA:logon_securityid}\n\tAccount Name:\t\t%{DATA:logon_accountname}\n\tAccount Domain:\t\t%{DATA:logon_accountdomain}\n\tLogon ID:\t\t%{DATA:logon_logonid}\n\tLogon GUID:\t\t%{DATA:logon_logonguid}\n\nProcess Information:\n\tProcess ID:\t\t%{DATA:process_processid}\n\tProcess Name:\t\t%{DATA:process_processname}\n\nNetwork Information:\n\tWorkstation Name:\t\t*%{DATA:networkinfo_workstationname}\n\tSource Network Address:\t%{DATA:network_address}\n\tSource Port:\t\t%{DATA:network_sourceport}\n\nDetailed Authentication Information:\n\tLogon Process:\t\t%{DATA:auth_logonprocess}\n\tAuthentication Package:\t%{DATA:auth_authpackage}\n\tTransited Services:\t%{DATA:auth_transitedservices}\n\tPackage Name \(NTLM only\):\t%{DATA:auth_packagename}\n\tKey Length:\t\t%{INT:auth_keylength}'
        }
      }
    }

    if [event_id] == 4634
    {
      grok {
        break_on_match => true
        match => {
          'message' => '%{DATA:event_message}\n\nSubject:\n\tSecurity ID:\t\t%{DATA:security_id}\n\tAccount Name:\t\t%{DATA:subject_accountname}\n\tAccount Domain:\t\t%{DATA:subject_accountdomain}\n\tLogon ID:\t\t%{DATA:subject_logonid}\n\nLogon Type:\t\t\t%{INT:logon_type}'
        }
      }
    }  
    if [event_id] == 5136
    {
        grok{
        break_on_match => true
        match => {
            'message' => '%{GREEDYDATA:event_message}\n\t\nSubject:\n\tSecurity ID:\t\t%{GREEDYDATA:subject_securityid}\n\tAccount Name:\t\t%{GREEDYDATA:subject_accountname}\n\tAccount Domain:\t\t%{GREEDYDATA:subject_accountdomain}\n\tLogon ID:\t\t%{GREEDYDATA:subject_logonid}\n\nDirectory Service:\n\tName:\t%{GREEDYDATA:directoryservice_name}\n\tType:\t%{GREEDYDATA:directoryservice_type}\n\t\nObject:\n\tDN:\t%{GREEDYDATA:object_dn}\n\tGUID:\t%{GREEDYDATA:object_guid}\n\tClass:\t%{GREEDYDATA:object_class}\n\t\nAttribute:\n\t\LDAP Display Name:\t%{GREEDYDATA:attribute_ldap_displayname}\n\t\Syntax \(OID\):\t%{GREEDYDATA:attribute_syntax_oid}\n\tValue:\t%{GREEDYDATA:attribute_value}\n\t\nOperation:\n\tType:\t%{GREEDYDATA:operation_type}\n\tCorrelation ID:\t%{GREEDYDATA:operation_correlationid}\n\tApplication Correlation ID:\t%{GREEDYDATA:operation_application_correlationid}'
            }
        }
    }

  mutate {
      remove_tag => [ "unfilteredbeats" ]
    }     
  }
  else if [type] == "exchangemessagetrack"
  {
    csv {
      columns => [ "datetime","clientip","clienthostname","serverip","serverhostname","sourcecontext","connectorid","source","eventid","internalmessageid","messageid","recipientaddress","recipientstatus","totalbytes","recipientcount","relatedrecipientaddress","reference","messagesubject","senderaddress","returnpath","messageinfo","directionality","tenantid","originalclientip","originalserverip","customdata" ]
    }

  } 
}

#Secondary filter, if we broke from the first one due to hitting a match.
filter
{
  if [type] == "syslog"
  {
    if "asa_firewall" in [tags]
    {
      if [reason] == "Failover primary closed"
      {
        drop {}
      }
      if "" in [src_ip]
      {
            geoip {
                source => "src_ip"
                target => "src_ip_geoip"
                database => "C:\logstash\maxmind\GeoLiteCity.dat"
                add_field => ["src_ip_coords", "%{[src_ip_geoip][longitude]}"]
                add_field => ["src_ip_coords", "%{[src_ip_geoip][latitude]}"]
            }      
      } 

      if "" in [dst_ip]
      {
            geoip {
                source => "dst_ip"
                target => "dst_ip_geoip"
                database => "C:\logstash\maxmind\GeoLiteCity.dat"
                add_field => ["dst_ip_coords", "%{[dst_ip_geoip][longitude]}"]
                add_field => ["dst_ip_coords", "%{[dst_ip_geoip][latitude]}"]
            }      
      } 

      if "" in [src_xlated_ip]
      {
            geoip {
                source => "src_xlated_ip"
                target => "src_xlated_ip_geoip"
                database => "C:\logstash\maxmind\GeoLiteCity.dat"
                add_field => ["src_xlated_ip_coords", "%{[src_xlated_ip_geoip][longitude]}"]
                add_field => ["src_xlated_ip_coords", "%{[src_xlated_ip_geoip][latitude]}"]
            }      
      } 

    }
  }
}

output {


  if [type] == "syslog"
  {
    if "sophoswebproxy" in [tags] {
        elasticsearch {
            hosts => ["logstore01:9200","logstore02:9200"]
            index => "sophosproxy-%{+YYYY.MM.dd}"
        }
    }  

    if "asa_firewall" in [tags] {
        elasticsearch {
            hosts => ["logstore01:9200","logstore02:9200"]
            index => "firewall-%{+YYYY.MM.dd}"
        }
    }

    if "sophosemailappliance" in [tags] {
        elasticsearch {
            hosts => ["logstore01:9200","logstore02:9200"]
            index => "sophosemailappliance-%{+YYYY.MM.dd}"
        }
    }    

    if "unfilteredsyslog" in [tags] {
        elasticsearch {
            hosts => ["logstore01:9200","logstore02:9200"]
            index => "unknownsyslog-%{+YYYY.MM.dd}"
        }
    }

    if [type] == "aixsyslog" {
        elasticsearch {
            hosts => ["logstore01:9200","logstore02:9200"]
            index => "aixsyslog-%{+YYYY.MM.dd}"
        }
    }         
  }

  if [type] == "wineventlog" {
        elasticsearch {
            hosts => ["logstore01:9200","logstore02:9200"]
            index => "windowseventlog-%{+YYYY.MM.dd}"
        }

    if "groupPolicyChange" in [tags]
    {
      rabbitmq {
                user => 'logstash'
                password => 'super_secret'
                durable => true
                exchange => 'logstash.unsecure'
                exchange_type => 'topic'
                host => 'PI-MQ'
                port => 5672
                vhost => 'logstash'
          key => 'gpochange'
        }
    }
  }

  if [type] == "snmptrap" {
        elasticsearch {
            hosts => ["logstore01:9200","logstore02:9200"]
            index => "generalsnmp-%{+YYYY.MM.dd}"
        }
  }

  if [type] == "exchangemessagetrack"
  {
        elasticsearch {
            hosts => ["logstore01:9200","logstore02:9200"]
            index => "exchange-messagetracking-%{+YYYY.MM.dd}"
        }   
  }

  if [type] == "elasticsearch_log"
  {
        elasticsearch {
            hosts => ["logstore01:9200","logstore02:9200"]
            index => "elasticsearch-serverlog-%{+YYYY.MM.dd}"
        }  
  }

  if [type] == "esb_service_log"
  {
        elasticsearch {
            hosts => ["logstore01:9200","logstore02:9200"]
            index => "esb-servicelog-%{+YYYY.MM.dd}"
        }   
  }

  if [type] == "logstash_log"
  {
        elasticsearch {
            hosts => ["logstore01:9200","logstore02:9200"]
            index => "logstash-serverlog-%{+YYYY.MM.dd}"
        }    
  }

  if [type] == "messagequeue_log"
  {
        elasticsearch {
            hosts => ["logstore01:9200","logstore02:9200"]
            index => "messagequeue-serverlog-%{+YYYY.MM.dd}"
        }    
  }  

  if [type] == "rsa_log"
  {
        elasticsearch {
            hosts => ["logstore01:9200","logstore02:9200"]
            index => "rsa-serverlog-%{+YYYY.MM.dd}"
        }    
  } 

  if "unfilteredbeats" in [tags] {
        elasticsearch {
            hosts => ["logstore01:9200","logstore02:9200"]
            index => "unknownbeats-%{+YYYY.MM.dd}"
        }    
  }

}

@arizonawayfarer
Author

Here's the config for the logstash instance that collects the messages and puts them into MQ

input {
    beats {
        port => 5045
    tags => [ "beats" ]
    }
  syslog {
        port => 514
        type => "syslog"
    }
  snmptrap {
    community => [ "secret" ]
    type => "snmptrap"
    codec => plain {
                    charset => "BINARY"
    }    
    yamlmibdir => "C:\logstash\vendor\bundle\jruby\1.9\gems\snmp-1.2.0\data\ruby\snmp\mibs"
    port => 162
  }  
}

filter {
  if [type] == "syslog" {
    mutate {
      add_tag => [ "unfilteredsyslog" ]
    }  
  }
}

output {
        rabbitmq {
            user => 'logstash'
            password => 'secret'
            durable => true
            exchange => 'logstash.unsecure'
            exchange_type => 'topic'
            host => 'mqhost'
            port => 5672
            vhost => 'logstash'
        }
}

@arizonawayfarer
Author

I guess I don't understand how it can be a configuration issue when this same config works perfectly with logstash 2.2.

@purbon
Contributor

purbon commented Apr 4, 2016

I'm not saying it's a config issue, but it might be (I guess) a bug in a plugin used in your config. Just thinking out loud, since we had problems trying to reproduce 👍

@purbon
Contributor

purbon commented Apr 4, 2016

Thanks a lot for your patience and help in trying to narrow this down!

@ph
Contributor

ph commented Apr 4, 2016

From a quick look at the configuration, I don't see anything suspect. I'll try more testing with it ASAP.

@arizonawayfarer
Author

Here's a list of my installed plugins:

[logproc] C:\Users\username>c:\logstash\bin\plugin list
io/console not supported; tty will not be manipulated
logstash-codec-collectd
logstash-codec-dots
logstash-codec-edn
logstash-codec-edn_lines
logstash-codec-es_bulk
logstash-codec-fluent
logstash-codec-graphite
logstash-codec-json
logstash-codec-json_lines
logstash-codec-line
logstash-codec-msgpack
logstash-codec-multiline
logstash-codec-netflow
logstash-codec-oldlogstashjson
logstash-codec-plain
logstash-codec-rubydebug
logstash-filter-anonymize
logstash-filter-checksum
logstash-filter-clone
logstash-filter-csv
logstash-filter-date
logstash-filter-de_dot
logstash-filter-dns
logstash-filter-drop
logstash-filter-fingerprint
logstash-filter-geoip
logstash-filter-grok
logstash-filter-json
logstash-filter-kv
logstash-filter-metrics
logstash-filter-multiline
logstash-filter-mutate
logstash-filter-ruby
logstash-filter-sleep
logstash-filter-split
logstash-filter-syslog_pri
logstash-filter-throttle
logstash-filter-urldecode
logstash-filter-useragent
logstash-filter-uuid
logstash-filter-xml
logstash-input-beats
logstash-input-couchdb_changes
logstash-input-elasticsearch
logstash-input-eventlog
logstash-input-exec
logstash-input-file
logstash-input-ganglia
logstash-input-gelf
logstash-input-generator
logstash-input-graphite
logstash-input-heartbeat
logstash-input-http
logstash-input-http_poller
logstash-input-imap
logstash-input-irc
logstash-input-jdbc
logstash-input-kafka
logstash-input-log4j
logstash-input-lumberjack
logstash-input-pipe
logstash-input-rabbitmq
logstash-input-redis
logstash-input-s3
logstash-input-snmptrap
logstash-input-sqs
logstash-input-stdin
logstash-input-syslog
logstash-input-tcp
logstash-input-twitter
logstash-input-udp
logstash-input-unix
logstash-input-xmpp
logstash-input-zeromq
logstash-output-cloudwatch
logstash-output-csv
logstash-output-elasticsearch
logstash-output-email
logstash-output-exec
logstash-output-file
logstash-output-ganglia
logstash-output-gelf
logstash-output-graphite
logstash-output-hipchat
logstash-output-http
logstash-output-irc
logstash-output-juggernaut
logstash-output-kafka
logstash-output-lumberjack
logstash-output-nagios
logstash-output-nagios_nsca
logstash-output-null
logstash-output-opentsdb
logstash-output-pagerduty
logstash-output-pipe
logstash-output-rabbitmq
logstash-output-redis
logstash-output-s3
logstash-output-sns
logstash-output-sqs
logstash-output-statsd
logstash-output-stdout
logstash-output-tcp
logstash-output-udp
logstash-output-xmpp
logstash-output-zeromq
logstash-patterns-core

@arizonawayfarer
Author

I wanted to post an update: with 2.3.1 it looks like valid coordinates are being sent, but elastic is still returning a 400. I can't find the invalid coordinates in the message this time, but elastic is complaining about them in the src_ip_coords field. Come to think of it, I'm not sure I checked the elastic log when 2.3 was failing, so I might need to punch myself. This may be an issue with elastic?

Another theory I have is that the coordinates are being encapsulated in quotes, and that could be throwing elastic off?

I'm running elasticsearch 2.3.1

Coordinates being sent to elastic:

"src_ip_coords":["-71.0843","42.362599999999986"]

Stack trace from elastic:

[2016-04-11 10:40:06,797][DEBUG][action.bulk              ] [elasticnode] [firewall-2016.04.11][0] failed to execute bulk item (index) index {[firewall-2016.04.11][syslog][AVQGaLfticsWAvm8qMxv], source[{"message":"<182>Apr 11 2016 10:39:59: %ASA-6-106015: Deny TCP (no connection) from 23.211.98.73/443 to REDACTED/15062 flags FIN ACK  on interface outside\n","@version":"1","@timestamp":"2016-04-11T17:39:59.525Z","type":"syslog","host":"10.1.0.9","tags":["asa_firewall"],"priority":0,"severity":0,"facility":0,"facility_label":"kernel","severity_label":"Emergency","action":"Deny","protocol":"TCP","policy_id":"no connection","src_ip":"23.211.98.73","src_port":"443","dst_ip":"REDACTED","dst_port":"15062","tcp_flags":"FIN ACK","interface":"outside\n","src_ip_geoip":{"ip":"23.211.98.73","country_code2":"US","country_code3":"USA","country_name":"United States","continent_code":"NA","region_name":"MA","city_name":"Cambridge","postal_code":"02142","latitude":42.362599999999986,"longitude":-71.0843,"dma_code":506,"area_code":617,"timezone":"America/New_York","real_region_name":"Massachusetts","location":[-71.0843,42.362599999999986]},"src_ip_coords":["-71.0843","42.362599999999986"],"dst_ip_geoip":{"ip":"65.89.28.194","country_code2":"US","country_code3":"USA","country_name":"United States","continent_code":"NA","region_name":"AZ","city_name":"Phoenix","postal_code":"85016","latitude":33.50909999999999,"longitude":-112.0265,"dma_code":753,"area_code":602,"timezone":"America/Phoenix","real_region_name":"Arizona","location":[-112.0265,33.50909999999999]},"dst_ip_coords":["-112.0265","33.50909999999999"]}]}
MapperParsingException[failed to parse]; nested: IllegalArgumentException[illegal latitude value [269.82439041137695] for src_ip_coords];
    at org.elasticsearch.index.mapper.DocumentParser.parseDocument(DocumentParser.java:154)
    at org.elasticsearch.index.mapper.DocumentMapper.parse(DocumentMapper.java:309)
    at org.elasticsearch.index.shard.IndexShard.prepareCreate(IndexShard.java:529)
    at org.elasticsearch.index.shard.IndexShard.prepareCreateOnPrimary(IndexShard.java:506)
    at org.elasticsearch.action.index.TransportIndexAction.prepareIndexOperationOnPrimary(TransportIndexAction.java:215)
    at org.elasticsearch.action.index.TransportIndexAction.executeIndexRequestOnPrimary(TransportIndexAction.java:224)
    at org.elasticsearch.action.bulk.TransportShardBulkAction.shardIndexOperation(TransportShardBulkAction.java:326)
    at org.elasticsearch.action.bulk.TransportShardBulkAction.shardOperationOnPrimary(TransportShardBulkAction.java:119)
    at org.elasticsearch.action.bulk.TransportShardBulkAction.shardOperationOnPrimary(TransportShardBulkAction.java:68)
    at org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryPhase.doRun(TransportReplicationAction.java:639)
    at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
    at org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryOperationTransportHandler.messageReceived(TransportReplicationAction.java:279)
    at org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryOperationTransportHandler.messageReceived(TransportReplicationAction.java:271)
    at org.elasticsearch.shield.transport.ShieldServerTransportService$ProfileSecuredRequestHandler.messageReceived(ShieldServerTransportService.java:180)
    at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:75)
    at org.elasticsearch.transport.TransportService$4.doRun(TransportService.java:376)
    at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)
Caused by: java.lang.IllegalArgumentException: illegal latitude value [269.82439041137695] for src_ip_coords
    at org.elasticsearch.index.mapper.geo.GeoPointFieldMapper.parse(GeoPointFieldMapper.java:123)
    at org.elasticsearch.index.mapper.geo.BaseGeoPointFieldMapper.parsePointFromString(BaseGeoPointFieldMapper.java:505)
    at org.elasticsearch.index.mapper.geo.BaseGeoPointFieldMapper.parse(BaseGeoPointFieldMapper.java:473)
    at org.elasticsearch.index.mapper.DocumentParser.parseObjectOrField(DocumentParser.java:309)
    at org.elasticsearch.index.mapper.DocumentParser.parseArray(DocumentParser.java:371)
    at org.elasticsearch.index.mapper.DocumentParser.parseObject(DocumentParser.java:254)
    at org.elasticsearch.index.mapper.DocumentParser.parseDocument(DocumentParser.java:122)

@arizonawayfarer
Author

I think the coordinates being wrapped in quotes is it. In 2.2.1 the field values containing coordinates aren't wrapped in quotes.

from 2.2.1:

 "src_ip_coords": [-71.0843,42.362599999999986],

@arizonawayfarer
Author

If I convert the offending field to a float using a mutate block, it goes into elastic just fine. I think this pretty much confirms the quoting issue?

    geoip {
        source => "src_ip"
        target => "src_ip_geoip"
        database => "C:\logstash\maxmind\GeoLiteCity.dat"
        add_field => ["src_ip_coords", "%{[src_ip_geoip][longitude]}"]
        add_field => ["src_ip_coords", "%{[src_ip_geoip][latitude]}"]
    }

       mutate
       {
         convert => { "src_ip_coords" => "float" }
       } 
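For reference, the same workaround could presumably be applied to the other coordinate fields used in the config above; a sketch, assuming the field names from that config:

       mutate
       {
         convert => {
           "src_ip_coords"        => "float"
           "dst_ip_coords"        => "float"
           "src_xlated_ip_coords" => "float"
         }
       }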

@ph
Contributor

ph commented Apr 12, 2016

@arizonawayfarer Sounds like it. I don't think we've made significant changes to the geoip filter between the two releases. This might be a side effect of the Java event.

Did you try to use 2.3.1 to see if you still have the problem?

@arizonawayfarer
Author

Yep, same thing happens in 2.3.1. The problem isn't in the geoip filter. I think that the problem is whatever is taking the values returned by geoip and converting them into strings when the new fields are added.

It doesn't look like the geoip filter is actually adding the new fields - something else is doing that. And it's taking the floats returned by geoip and making them strings.

I don't really know how this stuff works, so I'm just fumbling my way around now trying to understand the workflow. I'm currently looking at string_interpolation.rb - that appears to do conversions?
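As a side note, plain Ruby string interpolation always produces a String regardless of the original type, which would explain the quoted coordinates (a tiny illustrative snippet, not Logstash code):

    latitude = 42.362599999999986
    latitude.class       # => Float
    "#{latitude}".class  # => String, i.e. "42.362599999999986"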

@ph
Contributor

ph commented Apr 12, 2016

Found the issue by doing a diff between the two string_interpolation classes.

      case value
      when nil
        "%{#{@key}}"
      when Array
        value.join(",")
      when Hash
        LogStash::Json.dump(value)
      else
        # Make sure we don't work on a reference to the value.
        # The Java Event implementation was always returning a string.
        "#{value}"
      end
    end

We coerce the data into a string.

@ph
Contributor

ph commented Apr 12, 2016

We need to add a special case when it's a Float/Fixnum.
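A minimal sketch of what that special case could look like, extending the case statement quoted above (an illustration of the idea, not the actual patch):

      case value
      when nil
        "%{#{@key}}"
      when Array
        value.join(",")
      when Hash
        LogStash::Json.dump(value)
      when Float, Fixnum
        # a bare fieldref to a number keeps its original type
        value
      else
        "#{value}"
      end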

@ph
Contributor

ph commented Apr 12, 2016

@colinsurprenant I would like to have your input on this. We made a change to StringInterpolation in #4592 to make sure we never return a reference; the Java implementation also returns a string. This change broke the fact that in some cases we were returning the original type.

What should we do in that case: extend the case statement to return Fixnum/Float values as-is and everything else as a string? We would need to implement the same logic in the Java event.

@colinsurprenant
Contributor

So if I recap:

  • prior to 2.3, string interpolation of numeric values kept the numeric type; for example

    add_field => ["coords", "%{[geoip][longitude]}"]
    add_field => ["coords", "%{[geoip][latitude]}"]
    

    would yield

    "coords": [-71.0843,42.362599999999986]
    
  • but in >= 2.3 numeric values are converted to strings

    "coords": ["-71.0843","42.362599999999986"]
    

So it seems @ph's proposed fix would address this and respect the previous behaviour.

From a higher level perspective though, I find it confusing that something like "%{[geoip][latitude]}" does not always return a string, since we are using quotes which normally denote a string. Wouldn't it make more sense to just have add_field => ["coords", %{[geoip][latitude]}] ?

@arizonawayfarer
Author

I think that from a user perspective, it should be one way or the other. My two cents? I've always kind of thought that the quotes were superfluous when referencing variables in a config. Whether or not a variable is wrapped in quotes shouldn't determine its datatype; the data type of the variable itself should do that.

I think the case statement is the right track: if it's a string, return a quoted value; if not, don't. Don't let the config dictate that.

@colinsurprenant
Contributor

@arizonawayfarer Yeah, but it becomes confusing in situations where you use interpolation within strings (which is very common), for example: add_field => ["foo", "bar-%{[numeric][field]}"], which would yield a string value "bar-123", whereas add_field => ["foo", "%{[numeric][field]}"] would yield a numeric value 123 even though the add_field used string quotes which, I believe, should force a string result. If you did not want to force a string conversion you'd use add_field => ["foo", %{[numeric][field]}] and the result would be of the original type, no?
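One way to picture that distinction (a hypothetical Ruby sketch using a plain hash in place of the event; this is not the actual Logstash implementation):

    def interpolate(fields, template)
      if template =~ /\A%\{([^}]+)\}\z/
        # the template is a single bare fieldref: keep the original type
        fields[$1]
      else
        # literal text mixed with fieldrefs: always build a String
        template.gsub(/%\{([^}]+)\}/) { fields[$1].to_s }
      end
    end

    fields = { "[numeric][field]" => 123 }
    interpolate(fields, "%{[numeric][field]}")     # => 123 (a number)
    interpolate(fields, "bar-%{[numeric][field]}") # => "bar-123" (a String)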

@colinsurprenant
Contributor

and I believe what I described in my previous comment is consistent with most/all string interpolation implementations in different programming languages.

Again, for the sake of compatibility with prior behaviour it seems we should move forward with @ph's fix, but my question here is more about how we move forward, particularly in the context of an upcoming major version release where breaking compatibility is possible.

@ph ph assigned ph and unassigned purbon Apr 13, 2016
@ph
Contributor

ph commented Apr 13, 2016

@colinsurprenant Thanks for jumping in. I also find it strange that string interpolation doesn't always return a string; for me, this syntax is similar to what sprintf would do.

I think for 2.3.2, we should apply the fix to make it backward compatible. But for a major release I am not sure we should support that.

I will create the patch to fix the issue in 2.3.x, and I will create an issue about breaking compatibility in 5.0 to see how we handle this moving forward.

@colinsurprenant
Contributor

Note that currently the no-quote string interpolation syntax add_field => ["foo", %{bar}] is not supported and results in a config syntax error.

ph added a commit to ph/logstash that referenced this issue Apr 13, 2016
The change introduced with elastic#4592 to deal with references had
a side effect that broke existing configurations using a fieldref to access a Fixnum or a Float.

The problem was that instead of returning the original type, the value
was converted to a string. This string could break some mappings that
were expecting a float.

Fixes elastic#4961
@ph
Contributor

ph commented Apr 13, 2016

OK, I have created a patch which should be reviewed soon and created #5114 to see if we drop support for that in 5.0.

@ph
Contributor

ph commented Apr 13, 2016

@arizonawayfarer @colinsurprenant I found this: #5114 (comment)

It's a regression, yes, but from 1.4.2 -> 1.5.x.

ph added a commit to ph/logstash that referenced this issue Apr 14, 2016
ph added a commit to ph/logstash that referenced this issue Apr 14, 2016
ph added a commit to ph/logstash that referenced this issue Apr 15, 2016
elasticsearch-bot pushed a commit that referenced this issue Apr 15, 2016 (Fixes #4961, #5113)
elasticsearch-bot pushed a commit that referenced this issue Apr 15, 2016 (Fixes #4961, #5113)
@ph
Contributor

ph commented Apr 15, 2016

We will revert this change for 2.3.2, so using a fieldref should keep the type, but we will make changes for 5.0; please follow #5114 for the details.

@ph ph closed this as completed Apr 15, 2016