Filebeat httpjson input response.split fails on an array of arrays #30345

Closed
wasserman opened this issue Feb 10, 2022 · 4 comments · Fixed by #33609

Comments

@wasserman

When trying to split a response field that is an array of arrays, I get the following error:
error(*errors.errorString) *{s: "split was expecting field to be an object"}
It happens here:

if err := s.sendMessage(ctx, root, "", e, ch); err != nil {

Returned from here:

Keep in mind that the split type should default to array, and the flow does go down the array path.
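
To make the failure mode concrete, here is a minimal sketch of what seems to happen, assuming the split walks the target array and requires every element to be a JSON object. The names `toMap` and `splitArray` are hypothetical stand-ins, not the actual Filebeat functions; only the error text is taken from the report above.

```go
package main

import (
	"errors"
	"fmt"
)

// errExpectedObject reuses the message quoted in the report.
var errExpectedObject = errors.New("split was expecting field to be an object")

// toMap is a simplified, hypothetical stand-in for the input's conversion
// helper: it accepts JSON objects only.
func toMap(v interface{}) (map[string]interface{}, bool) {
	m, ok := v.(map[string]interface{})
	return m, ok
}

// splitArray returns one object per array element and fails on the first
// element that is not an object (for example a nested array or a string).
func splitArray(elems []interface{}) ([]map[string]interface{}, error) {
	out := make([]map[string]interface{}, 0, len(elems))
	for _, e := range elems {
		m, ok := toMap(e)
		if !ok {
			return nil, errExpectedObject
		}
		out = append(out, m)
	}
	return out, nil
}

func main() {
	// Shaped like body.rows in the sample below: an array of arrays.
	rows := []interface{}{
		[]interface{}{690.4515, false, "Logstash Airways"},
		[]interface{}{265.5823, false, "Kibana Airlines"},
	}
	_, err := splitArray(rows)
	fmt.Println(err)
}
```

Under that assumption, an array of objects splits cleanly, while an array whose elements are arrays fails on the first element.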

My scenario is similar to the example I'm providing here, but I just used the Elastic demo site to make it easy to test and reproduce.

The sample input was generated by running the following in https://demo.elastic.co/app/dev_tools#/console:

POST _xpack/sql
{
  "query": "SELECT * FROM flights LIMIT 5"
}

I saved the output and served it as a static file from a local web server for simplicity:

{
  "columns" : [
    {
      "name" : "AvgTicketPrice",
      "type" : "float"
    },
    {
      "name" : "Cancelled",
      "type" : "boolean"
    },
    {
      "name" : "Carrier",
      "type" : "text"
    },
    {
      "name" : "Dest",
      "type" : "text"
    },
    {
      "name" : "DestAirportID",
      "type" : "text"
    },
    {
      "name" : "DestCityName",
      "type" : "text"
    },
    {
      "name" : "DestCountry",
      "type" : "text"
    },
    {
      "name" : "DestLocation.lat",
      "type" : "text"
    },
    {
      "name" : "DestLocation.lon",
      "type" : "text"
    },
    {
      "name" : "DestRegion",
      "type" : "text"
    },
    {
      "name" : "DestWeather",
      "type" : "text"
    },
    {
      "name" : "DistanceKilometers",
      "type" : "float"
    },
    {
      "name" : "DistanceMiles",
      "type" : "float"
    },
    {
      "name" : "FlightDelay",
      "type" : "boolean"
    },
    {
      "name" : "FlightDelayMin",
      "type" : "long"
    },
    {
      "name" : "FlightDelayType",
      "type" : "text"
    },
    {
      "name" : "FlightNum",
      "type" : "text"
    },
    {
      "name" : "FlightTimeHour",
      "type" : "float"
    },
    {
      "name" : "FlightTimeMin",
      "type" : "float"
    },
    {
      "name" : "Origin",
      "type" : "text"
    },
    {
      "name" : "OriginAirportID",
      "type" : "text"
    },
    {
      "name" : "OriginCityName",
      "type" : "text"
    },
    {
      "name" : "OriginCountry",
      "type" : "text"
    },
    {
      "name" : "OriginLocation.lat",
      "type" : "text"
    },
    {
      "name" : "OriginLocation.lon",
      "type" : "text"
    },
    {
      "name" : "OriginRegion",
      "type" : "text"
    },
    {
      "name" : "OriginWeather",
      "type" : "text"
    },
    {
      "name" : "dayOfWeek",
      "type" : "long"
    },
    {
      "name" : "timestamp",
      "type" : "datetime"
    }
  ],
  "rows" : [
    [
      690.4515,
      false,
      "Logstash Airways",
      "Zurich Airport",
      "ZRH",
      "Zurich",
      "CH",
      "47.464699",
      "8.54917",
      "CH-ZH",
      "Clear",
      261.0023,
      162.1793,
      false,
      0,
      "No Delay",
      "0VA8ULZ",
      0.24166878,
      14.500127,
      "Turin Airport",
      "TO11",
      "Torino",
      "IT",
      "45.200802",
      "7.64963",
      "IT-21",
      "Rain",
      3,
      "2018-05-31T15:14:59.000Z"
    ],
    [
      265.5823,
      false,
      "Kibana Airlines",
      "Zurich Airport",
      "ZRH",
      "Zurich",
      "CH",
      "47.464699",
      "8.54917",
      "CH-ZH",
      "Damaging Wind",
      16323.602,
      10143.016,
      true,
      180,
      "Late Aircraft Delay",
      "6TFDEZN",
      25.671669,
      1540.3002,
      "Melbourne International Airport",
      "MEL",
      "Melbourne",
      "AU",
      "-37.673302",
      "144.843002",
      "SE-BD",
      "Hail",
      1,
      "2018-05-29T08:44:16.000Z"
    ],
    [
      960.86975,
      true,
      "Kibana Airlines",
      "Rajiv Gandhi International Airport",
      "HYD",
      "Hyderabad",
      "IN",
      "17.23131752",
      "78.42985535",
      "SE-BD",
      "Cloudy",
      7044.367,
      4377.167,
      true,
      15,
      "NAS Delay",
      "M05KE88",
      10.033843,
      602.0306,
      "Milano Linate Airport",
      "MI11",
      "Milan",
      "IT",
      "45.445099",
      "9.27674",
      "IT-25",
      "Heavy Fog",
      0,
      "2018-05-28T12:09:35.000Z"
    ],
    [
      479.79047,
      false,
      "Logstash Airways",
      "Sydney Kingsford Smith International Airport",
      "SYD",
      "Sydney",
      "AU",
      "-33.94609833",
      "151.177002",
      "SE-BD",
      "Rain",
      15551.353,
      9663.162,
      false,
      0,
      "No Delay",
      "OHLGFR3",
      17.279282,
      1036.7568,
      "Lester B. Pearson International Airport",
      "YYZ",
      "Toronto",
      "CA",
      "43.67720032",
      "-79.63059998",
      "CA-ON",
      "Clear",
      1,
      "2018-05-29T16:20:44.000Z"
    ],
    [
      648.5135,
      false,
      "JetBeats",
      "Sydney Kingsford Smith International Airport",
      "SYD",
      "Sydney",
      "AU",
      "-33.94609833",
      "151.177002",
      "SE-BD",
      "Hail",
      15913.688,
      9888.308,
      false,
      0,
      "No Delay",
      "PJOSR23",
      13.261407,
      795.6844,
      "Oslo Gardermoen Airport",
      "OSL",
      "Oslo",
      "NO",
      "60.19390106",
      "11.10039997",
      "NO-02",
      "Clear",
      3,
      "2018-05-31T12:46:33.000Z"
    ]
  ]
}

The Filebeat input config is as follows:

- type: httpjson
  interval: 30s
  config_version: 2
  request.url: http://localhost:8000/output.json
  request.method: GET

  response.split:
    target: body.rows

Tested on Filebeat 7.16.2.

I hope that this is a simple thing to fix.

Bonus points for some smart way to map the columns to rows after such a split so Filebeat can output documents!

Thanks!

@botelastic botelastic bot added the needs_team Indicates that the issue/PR needs a Team:* label label Feb 10, 2022
@elasticmachine
Collaborator

Pinging @elastic/security-external-integrations (Team:Security-External Integrations)

@botelastic botelastic bot removed the needs_team Indicates that the issue/PR needs a Team:* label label Feb 11, 2022
@legoguy1000
Contributor

legoguy1000 commented Feb 11, 2022

The same issue happens if you have an array of strings. It's caused by the forced cast here:

func toMapStr(v interface{}) (common.MapStr, bool) {

I added the case below to the type switch and it seems to work.

	case string, []interface{}:
		temp := make(map[string]interface{})
		temp["data"] = t
		return common.MapStr(temp), true
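
For readers who want to try the idea outside the Beats tree, here is a self-contained sketch of the same approach. It assumes the helper is a type switch over the element, and it uses a plain `map[string]interface{}` in place of `common.MapStr`; `toMapWrapped` is a hypothetical name, not the real function.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// toMapWrapped is a hypothetical stand-in for the patched helper. Objects
// pass through unchanged; strings and arrays are wrapped under a "data" key,
// as in the case suggested above.
func toMapWrapped(v interface{}) (map[string]interface{}, bool) {
	switch t := v.(type) {
	case map[string]interface{}:
		return t, true
	case string, []interface{}:
		// With several types in one case, t keeps the static type interface{}.
		return map[string]interface{}{"data": t}, true
	default:
		return nil, false
	}
}

func main() {
	elems := []interface{}{
		[]interface{}{690.4515, false, "Logstash Airways"},
		"a plain string",
		42.0, // still rejected: neither an object, a string, nor an array
	}
	for _, e := range elems {
		m, ok := toMapWrapped(e)
		b, err := json.Marshal(m)
		if err != nil {
			panic(err)
		}
		fmt.Println(ok, string(b))
	}
}
```

Note that other scalar element types, such as numbers and booleans, would still be rejected by this sketch.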

Using the example above, it produces:

{
  "@timestamp": "2022-02-11T21:21:53.654Z",
  "@metadata": {
    "beat": "filebeat",
    "type": "_doc",
    "version": "8.1.0"
  },
  "agent": {
    "name": "dev",
    "type": "filebeat",
    "version": "8.1.0",
    "ephemeral_id": "5cf38d3b-78f8-4fca-9225-5220cd794dfd",
    "id": "2f86261e-143b-4c29-bc66-ee43cf4379c7"
  },
  "event": {
    "created": "2022-02-11T21:21:53.654Z"
  },
  "message": "{\"data\":[690.4515,false,\"Logstash Airways\",\"Zurich Airport\",\"ZRH\",\"Zurich\",\"CH\",\"47.464699\",\"8.54917\",\"CH-ZH\",\"Clear\",261.0023,162.1793,false,0,\"No Delay\",\"0VA8ULZ\",0.24166878,14.500127,\"Turin Airport\",\"TO11\",\"Torino\",\"IT\",\"45.200802\",\"7.64963\",\"IT-21\",\"Rain\",3,\"2018-05-31T15:14:59.000Z\"]}",
  "input": {
    "type": "httpjson"
  },
  "test": {
    "data": [
      690.4515,
      false,
      "Logstash Airways",
      "Zurich Airport",
      "ZRH",
      "Zurich",
      "CH",
      "47.464699",
      "8.54917",
      "CH-ZH",
      "Clear",
      261.0023,
      162.1793,
      false,
      0,
      "No Delay",
      "0VA8ULZ",
      0.24166878,
      14.500127,
      "Turin Airport",
      "TO11",
      "Torino",
      "IT",
      "45.200802",
      "7.64963",
      "IT-21",
      "Rain",
      3,
      "2018-05-31T15:14:59.000Z"
    ]
  },
  "ecs": {
    "version": "8.0.0"
  },
  "host": {
    "name": "dev"
  }
}

@tompipe

tompipe commented Jun 18, 2022

I was hit by this issue too, annoyingly, with JSON of this structure:

{
   "row_headers":[
      "Date",
      "Time Spent (seconds)"
   ],
   "rows":[
      [
         "2022-06-18T00:00:00",
         123
      ],
      [
         "2022-06-18T00:00:00",
         456
      ]
   ]
}

I was banging my head for a while, as this was my first time using Go templates, but eventually I managed to come up with this as a workaround.

- set:
    target: body.rows
    value_type: json
    value: '[
        [[- $last_row := len (slice .last_response.body.rows 1) -]]
        [[- $headers := .last_response.body.row_headers -]]
        [[- range $row_idx, $row := .last_response.body.rows -]]
            {
                [[- $last_cell := len (slice $row 1) -]]
                [[- range $cell_idx, $cell := $row -]]
                  [[- sprintf "%q : %q" (js (index $headers $cell_idx)) (js $cell) -]]
                  [[- if lt $cell_idx $last_cell -]][[- sprintf ","]][[end]]
                [[end]]
            }
            [[- if lt $row_idx $last_row -]][[- sprintf ","]][[end]]
        [[end]]
    ]'

This converts each row array into a JSON object, with its values keyed by the corresponding header.

I ran into some encoding issues on some of the data, hence the js call to encode the values.

The resulting body.rows looks something like this:

[
    {
       "Date" : "2022-06-18T00:00:00",
       "Time Spent (seconds)" : "123"
    },
    {
       "Date" : "2022-06-18T00:00:00",
       "Time Spent (seconds)" : "456"
    }
]
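
For anyone who finds the template hard to follow, the same header-to-cell pairing can be sketched in plain Go. This only illustrates the logic and is not something Filebeat runs; `zipRows` is a hypothetical helper, and it assumes headers and cells line up by index.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// zipRows pairs each cell with the header at the same index and returns one
// object per row. Cells beyond the last header are dropped.
func zipRows(headers []string, rows [][]interface{}) []map[string]interface{} {
	out := make([]map[string]interface{}, 0, len(rows))
	for _, row := range rows {
		obj := make(map[string]interface{}, len(headers))
		for i, cell := range row {
			if i >= len(headers) {
				break
			}
			obj[headers[i]] = cell
		}
		out = append(out, obj)
	}
	return out
}

func main() {
	headers := []string{"Date", "Time Spent (seconds)"}
	rows := [][]interface{}{
		{"2022-06-18T00:00:00", 123},
		{"2022-06-18T00:00:00", 456},
	}
	b, err := json.MarshalIndent(zipRows(headers, rows), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(b))
}
```

One difference from the template: this sketch keeps the original value types, whereas the `%q` formatting in the template turns every value into a string.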

Hope it helps others before the fix above gets merged and becomes available (or someone points out a better approach 😉 )

Quoting @wasserman: "Bonus points for some smart way to map the columns to rows after such a split so Filebeat can output documents! Thanks!"

Here is a slightly modified (and not perfect) version, which appears to work with your example @wasserman - do I win the bonus points? 🏆 😄

@jsanz
Member

jsanz commented Jul 4, 2022

Quoting @tompipe: "I managed to come up with this as a workaround."

@tompipe thanks for your workaround! Could you share the rest of the Filebeat definition? I understand you use your snippet under response.transforms, without adding any response.split, right?
