Sighting protocol support to add in vast? #638

adulau opened this issue Nov 7, 2019 · 1 comment


@adulau adulau commented Nov 7, 2019

Is your feature request related to a problem? Please describe.

Sighting is a common technique used in threat intelligence platforms to sight specific attributes/indicators. In the next version of MISP, 2.4.118, we will release a generic service where you can add custom sighting servers. The query protocol is documented and there is a prototype sighting server.

Describe the solution you'd like

vast provides a fast-lookup data structure which could be used as a source of sightings. It would be great to have sighting functionality in vast so that MISP users can query the stored information/network flows, such as IP addresses seen or the like.

Describe alternatives you've considered

An alternative would be to have a misp-module query vast directly, but that's more intrusive than a simple sighting lookup.

@mavam mavam added the feature 🎁 label Nov 7, 2019

@mavam mavam commented Nov 7, 2019

This is a great idea. I read the SightingDB RFC today and think VAST is a good fit here. In principle, there are two modes of operation that could make sense, and maybe this could be some feedback for the RFC.

  1. Provide approximate answers: since VAST maintains an additional index to the base data, it can answer a query for "have you seen value X" from the index alone. However, not all index structures are exact. Some may yield false positives. But the error is one-sided, i.e., there would only be the chance of over-counting. On our roadmap, we have quantification of the error. One way of reporting this could be via a measure of confidence or variance. I'm not sure if this goes too far, but results in the SightingDB format could look like this:

       "value": "",
       "count": 578391,
       "confidence": 0.8,

    In this format, purely probabilistic backends could also report answers that users can interpret in a meaningful way.

  2. Provide exact answers: depending on the volume of sighting queries, computing exact counts may not be prohibitive. In this case, the index lookup would yield a candidate set of events that would have to be scanned to weed out false positives.
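
The two modes above can be sketched roughly as follows. This is a toy illustration, not VAST's implementation: `ToyApproximateIndex`, `build_index`, `approximate_count`, `exact_count`, and `sighting_response` are hypothetical names, and the bucketed index merely stands in for any index structure with one-sided error. How a backend would derive the `confidence` value is left open here, since the error quantification is still on the roadmap; the caller simply passes it in.

```python
from typing import Dict, List, Set


class ToyApproximateIndex:
    """Hypothetical stand-in for an inexact index (not VAST's actual index).

    Values are mapped to a small number of buckets, so a lookup can return
    events that do not contain the value (false positives), but it never
    misses an event that does contain it.
    """

    def __init__(self, num_buckets: int) -> None:
        self.num_buckets = num_buckets
        self.buckets: Dict[int, Set[int]] = {}

    def _bucket(self, value: str) -> int:
        # Deterministic toy hash; Python's built-in hash() is salted per process.
        return sum(value.encode("utf-8")) % self.num_buckets

    def add(self, event_id: int, value: str) -> None:
        self.buckets.setdefault(self._bucket(value), set()).add(event_id)

    def candidates(self, value: str) -> Set[int]:
        return self.buckets.get(self._bucket(value), set())


def build_index(events: Dict[int, List[str]], num_buckets: int) -> ToyApproximateIndex:
    """Index every value of every event."""
    index = ToyApproximateIndex(num_buckets)
    for event_id, values in events.items():
        for value in values:
            index.add(event_id, value)
    return index


def approximate_count(index: ToyApproximateIndex, value: str) -> int:
    """Mode 1: answer from the index alone; may over-count, never under-count."""
    return len(index.candidates(value))


def exact_count(index: ToyApproximateIndex, events: Dict[int, List[str]], value: str) -> int:
    """Mode 2: scan the candidate events to weed out false positives."""
    return sum(1 for event_id in index.candidates(value) if value in events[event_id])


def sighting_response(value: str, count: int, confidence: float) -> dict:
    """Build a response with the proposed (not yet standardized) confidence field."""
    return {"value": value, "count": count, "confidence": confidence}
```

The exact mode only touches the candidate set returned by the index, so its cost depends on how selective the index is rather than on the total number of stored events.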

Other considerations

  • If the index lookup yields exact answers, everything already works out of the box. For example, for IP addresses that would be the case in VAST, but not for opaque IDs, such as a community ID flow hash.

  • It would be great to qualify the type of the value. Is it an IP address? A URL? A file hash? A hash of something else? MISP already has categories and types, so adding that to the query would allow VAST to map the MISP type to all the relevant event types in a search. For example, specifying ip-src would make it possible to restrict the search to Zeek's field and Suricata's src_ip field. To support this, we could add a new field type in addition to value.

  • Compound attributes and MISP objects in general. How do you compose sightings for two values? If value A yields a count of 42, and value B a count of 7, then 49 would only be an upper bound. However, the backend may have the ability to stitch together the result in a more meaningful manner. This might be a topic for a future version of the RFC, but I think it's important to keep in mind, because for larger counts, the error may become significant. For timestamps, min(A,B) and max(A,B) would be natural though.
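
To make the last two points concrete, here is a minimal sketch of a typed query and of a naive composition of two sightings. Everything in it is an assumption for illustration: the `Sighting` class, the `first_seen`/`last_seen` field names, the `MISP_TYPE_TO_FIELDS` table, and the helper functions are hypothetical and not part of the RFC or of VAST. Only Suricata's `src_ip` mapping is taken from the discussion above; other sources are left out rather than guessed.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


# Hypothetical mapping from a MISP type to (source, field) pairs to search.
# Only the Suricata entry comes from the discussion; extend as needed.
MISP_TYPE_TO_FIELDS: Dict[str, List[Tuple[str, str]]] = {
    "ip-src": [("suricata", "src_ip")],
}


def fields_for_query(query: dict) -> List[Tuple[str, str]]:
    """Return the fields to restrict the search to, based on the proposed
    `type` key. An empty list means 'no restriction known'."""
    return MISP_TYPE_TO_FIELDS.get(query.get("type", ""), [])


@dataclass(frozen=True)
class Sighting:
    count: int
    first_seen: int  # Unix timestamp (assumed representation)
    last_seen: int   # Unix timestamp (assumed representation)


def compose(a: Sighting, b: Sighting) -> Sighting:
    """Naively combine two sightings.

    The summed count is only an upper bound on the combined count,
    while min/max give the natural first and last timestamps.
    """
    return Sighting(
        count=a.count + b.count,
        first_seen=min(a.first_seen, b.first_seen),
        last_seen=max(a.last_seen, b.last_seen),
    )
```

A backend that can correlate the two values internally could replace the summed upper bound in `compose` with a tighter count, which is the "more meaningful" stitching mentioned above.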
