[Feat] Use-cases]: Monitoring OPC UA with Netdata #562

shyamvalsan · 2022-08-31T08:53:35Z

Problem

Netdata cannot currently monitor OPC UA servers or related metrics (tags)
Current solutions for monitoring health, performance and usage of industrial automation systems are rigid and difficult to manage

Description

OPC UA is an open, industry-independent, secure connectivity framework for industrial automation data. It is designed for use by customers across many industrial sectors.

Industrial plants have a large variety of machines and sensors that need to be monitored for safety, maintenance and operational efficiency. Easy and efficient access to this data will considerably improve the R&D efficiency of the companies operating these plants. Maintenance teams will be able to develop more efficient maintenance plans, and process engineers will be able to optimize their production lines. ML and AI use-cases will also become feasible with access to high-fidelity, reliable monitoring data.

There should be a Netdata collector that can connect to OPC UA server(s) and collect all the associated metric information (tags) from it.

Here are some useful links to get started:

Importance

really want

Value proposition

Opens up a new market niche for Netdata: there are thousands of companies that operate industrial automation systems/PLCs, and if Netdata can offer a simple, flexible, feature-rich and cost-effective way to monitor these systems/machines, there is potential for a lot of connected nodes in the future.

shyamvalsan · 2022-08-31T11:58:00Z

This feature was requested by a user. Here's some feedback from a discussion I had with them.

Works as IS/IT coordinator in automation (large international manufacturer of heavy trucks), has many manufacturing plants, manufacturing components such as engines, gearboxes as well as assembly lines.

Has a large number (~400) of CNC machines, heat treatment furnaces (lots of sensors for temperature, pressure, atmosphere chemical composition, oil baths etc.), robots, etc. Most equipment is based on Siemens PLCs, though other brands exist of course.

Current tools to fetch and analyze machine process data are too difficult to use and maintain. E.g.: a Kepware OPC proxy is used to send data to a data lake/database. In order to do this, we have to configure the Kepware proxy with each specific signal, its data type, and how and where to send it. This is something production engineers can't do themselves; they have to request it from our internal IT department.

Trying to find better ways to provide actual machine data so that we can be much more agile. But also provide maintenance department with better information so they can do predictive/condition based maintenance instead of time/schedule based maintenance.

Goals are to be able to provide information to process engineers so they can optimize their production machines/lines and part quality, and provide maintenance department with enough information so they can develop much more efficient maintenance plans. Machine learning and AI needs as much information as possible to be effective.

Monitoring would have to be done remotely over Ethernet. Need a node that can fetch the data and deliver it to a parent. Possibly need several such nodes, since there are several hundred machines and a machine can have 1000-30000 "tags" that could be monitored. The collector should be able to access several OPC UA servers (machines), otherwise we'd need a swarm of collectors, which would need more resources and would be harder to maintain as an infrastructure.

cc: @ktsaou @cakrit @sashwathn @amalkov @ralphm

amalkov · 2022-08-31T18:58:26Z

I believe this is a good opportunity to step into the manufacturers' ecosystem. The outcome of this work can be a paid support plan. It would be good to analyse the effort and implementation complexity.

Probably we just need to implement a couple of collectors and let them go, to be driven by the community, to validate the need.

shyamvalsan · 2022-09-01T08:08:21Z

If we build the collector and have a guide to using it - we could test the waters by sharing it with https://www.reddit.com/r/PLC/ and see how the community receives it.

ilyam8 · 2022-09-06T14:17:55Z

@thiagoftsm can you share your thoughts before starting to implement something? atm I have 0 understanding of what OPC UA is and what the ways to collect metrics are, but I googled go opc ua and found https://github.com/gopcua/opcua.

thiagoftsm · 2022-09-06T15:23:19Z

> @thiagoftsm can you share your thought before starting to implement something? atm I have 0 understanding of what OPC UA is and what the ways to collect metrics are, but I googled go opc ua and found https://github.com/gopcua/opcua.

Thank you for the link @ilyam8! As soon as I finish the eBPF work I am doing right now, I will share data and details about what we can do 🤝.

thiagoftsm · 2022-09-07T19:53:47Z

@shyamvalsan the Python examples you used are not async examples. We will have to use the async version of the OPC UA library instead: https://github.com/FreeOpcUa/opcua-asyncio.

I know we will write the collector in Go; I am only calling attention to the fact that OPC servers have two modes.

thiagoftsm · 2022-09-08T01:12:40Z

@shyamvalsan about the OPC UA metrics, it looks like getting everything from the server is not recommended, because the protocol was not designed for this, as you can see here, and here.

thiagoftsm · 2022-09-12T13:01:50Z

Hello,

Last week I finished the work with Python to understand how OPC UA works (server, client, protocol). This week I am shifting to Go, because the Python library mentioned in the OP has limitations that do not allow us to get all the metrics we need, and of course the plugin will be written with another library.

During the Python development I observed that:

- The number of metrics we can collect is huge according to the documentation, and the documentation does not show all possible namespaces that a user can have.
- Metrics are delivered like a dictionary, as you can see in this example.

@shyamvalsan right now I am considering that we will use some values from namespace = 0, and probably we will use more metrics from namespaces with IDs two or higher. If users somehow want to collect everything, we would probably need a dashboard per PLC, because some Siemens PLCs have 30000 values.

There are a few issues to be addressed before starting development:
- Collect real data (I will get this with our user).
- How are we going to organize namespaces on the dashboard?
- We cannot assume that all servers will have the same values set in namespace zero, but we know the variables that can be there. Which values are we going to plot?
- Netdata cannot be installed on all hardware that runs OPC servers. How are we going to organize data collected from different PLCs?
- @stelfrag is there any plan to remove the current limitation on storing and retrieving thousands of metrics from our database?

Best regards!

shyamvalsan · 2022-09-12T13:16:11Z

@thiagoftsm

Regarding namespaces, it appears only 0 and 1 can be known in advance by Netdata. The rest is up to the user to configure if they want to monitor them.

I was thinking that namespaces should be correlated to jobs, so that each namespace will have a separate section in Netdata to itself.

Regarding collecting everything and whether there will be too many metrics: is 30000 the list of possible values, or the list of actually useful values that a user would want to monitor continually? I think the agent should figure out a way to ignore constant metrics or empty "tags", and that in practice the number may be lower per PLC (but this is just a hypothesis of mine and could be wrong).
IMO multiple PLCs should be treated as separate instances, and data coming from them should be aggregated under a namespace on composite charts.
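
To make the "ignore constant metrics or empty tags" idea concrete, here is a minimal Go sketch. It is only an illustration under stated assumptions: the `constantFilter` type, its method names and the example node IDs are hypothetical and are not part of Netdata or of any OPC UA library. It treats a tag as worth charting only once its value has changed at least once; a tag that was never observed (empty) or never changed is skipped.

```go
package main

import "fmt"

// tagStats holds the first value seen for a tag and whether it ever changed.
// Hypothetical helper type, used only for this sketch.
type tagStats struct {
	first   float64
	changed bool
}

// constantFilter tracks samples per node ID string (for example "ns=2;s=Temp").
type constantFilter struct {
	tags map[string]*tagStats
}

func newConstantFilter() *constantFilter {
	return &constantFilter{tags: make(map[string]*tagStats)}
}

// observe records one sample for the given tag.
func (f *constantFilter) observe(nodeID string, value float64) {
	st, ok := f.tags[nodeID]
	if !ok {
		f.tags[nodeID] = &tagStats{first: value}
		return
	}
	if value != st.first {
		st.changed = true
	}
}

// shouldChart reports whether the tag has changed at least once.
// Tags that were never observed, or that stayed constant, return false.
func (f *constantFilter) shouldChart(nodeID string) bool {
	st, ok := f.tags[nodeID]
	return ok && st.changed
}

func main() {
	f := newConstantFilter()
	// Hypothetical samples: one changing tag and one constant tag.
	for _, v := range []float64{20.5, 20.7, 21.0} {
		f.observe("ns=2;s=Furnace.Temp", v)
	}
	for _, v := range []float64{1, 1, 1} {
		f.observe("ns=2;s=Furnace.Model", v)
	}
	for _, id := range []string{"ns=2;s=Furnace.Temp", "ns=2;s=Furnace.Model", "ns=2;s=Unused"} {
		fmt.Println(id, f.shouldChart(id))
	}
}
```

A real collector would probably need a warm-up window and a way to re-check tags that start changing later; this sketch leaves both out.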

thiagoftsm · 2022-09-12T13:49:39Z

@shyamvalsan after I discuss your points with the users, I will bring another update.

thiagoftsm · 2022-09-13T01:22:41Z

During the tests I reached an OPC UA server that does not allow querying all nodes. Considering this scenario, the safest option looks like querying the IDs that are always present. The whole list is available in this link, with the prefix UA_NS0ID.
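
As a rough sketch of what "query only IDs that are always present" could look like, the Go snippet below builds node ID strings from a small fixed list instead of browsing the server. The three numeric identifiers are the standard namespace 0 IDs for the server status start time, current time and state; the `wellKnownNS0` table and the `ns0NodeID` helper are hypothetical names for this illustration, and the snippet does not talk to a server.

```go
package main

import "fmt"

// wellKnownNS0 lists a few standard namespace 0 nodes by numeric identifier.
// The table itself is a hypothetical allow-list for this sketch.
var wellKnownNS0 = []struct {
	name string
	id   uint32
}{
	{"Server_ServerStatus_StartTime", 2257},
	{"Server_ServerStatus_CurrentTime", 2258},
	{"Server_ServerStatus_State", 2259},
}

// ns0NodeID formats a numeric identifier as a namespace 0 node ID string.
func ns0NodeID(id uint32) string {
	return fmt.Sprintf("ns=0;i=%d", id)
}

func main() {
	// Print the node IDs a collector could request without browsing.
	for _, n := range wellKnownNS0 {
		fmt.Printf("%s -> %s\n", n.name, ns0NodeID(n.id))
	}
}
```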

thiagoftsm · 2022-09-19T15:26:27Z

Since my last message I ran different tests with different OPC servers and a specific PLC emulator developed by Microsoft. For the latter, I was running it with the following arguments:

docker run --rm -it -p 50000:50000 -p 8080:8080 --name opcplc mcr.microsoft.com/iotedge/opc-plc:latest --pn=50000 --autoaccept --sph --sn=5 --sr=10 --st=uint --fn=5 --fr=1 --ft=uint --ctb --scn --lid --lsn --ref --gn=5 --ut --aa --to

When I requested all variables from the Microsoft PLC I got this result using the Python library, because the Go client does not allow me to connect to any server and request all nodes (ns=0;i=84):

bash-5.1$ go run examples/read/read.go -endpoint opc.tcp://localhost:50000 -node 'ns=0;i=84'
Status not OK: The attribute is not supported for the specified Node. StatusBadAttributeIDInvalid (0x80350000)

As we discussed in our meeting, I am going to send an e-mail to our user requesting a real environment to test, and I will also report the issue in the gopcua repo.

Forza-tng · 2022-10-06T18:11:37Z

Hi, I just wanted to chime in on the value proposition. At my company we have lots and lots of UA-capable devices (CNC machines, robots, heat treatment, and other manufacturing equipment), and although there are plenty of commercial tools to gather data off these, they are usually focused on driving MES and ERP systems, or on gathering specific data.

The standard tools work well when we know exactly what data/signals we need. Then it is a matter of selecting the correct source and sending it to the right recipient/system.

My interest here is to find better ways to broadly gather data, visualise what it (the data) looks like, and provide ways to quickly look through thousands of signals/data sources. Netdata is very capable and can easily graph thousands of metrics in an easy-to-use interface.

My goals are several.

- Provide engineers easy access to their machines' data so they can check performance and quality. This data is important from a warranty point of view, but also from a capability and performance point of view.
- Provide maintenance departments with metrics that can help them change from mostly static maintenance scheduling to condition-based maintenance.
- Provide detailed metrics for data scientists' use. There are many universities that we work with that want to do research in the automation field, but lack of data makes many research projects difficult to perform.
- Develop machine learning tools that can better predict the resulting quality of produced parts based on the machine tool status.
- .. and more =)

thiagoftsm · 2023-03-03T17:43:26Z

@ilyam8 and @shyamvalsan I am adding here an example from a demo IoT environment. As you can see, the majority of the metrics are not defined and we won't use them.

Right now my expectation is that in a real environment, metrics not related to the server will be listed under a different namespace (ns=2 or higher) and we will focus our collection on them.
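
A minimal Go sketch of that selection step, assuming node IDs are available as strings in the usual `ns=<index>;<type>=<identifier>` form (a missing `ns=` part means namespace 0). The function names `namespaceIndex` and `userNodes`, and the example node IDs, are hypothetical; the snippet only filters strings and does not use any OPC UA library.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// namespaceIndex extracts the namespace index from a node ID string such as
// "ns=2;s=Machine.Temp" or "i=2258". Without an "ns=" prefix it returns 0.
func namespaceIndex(nodeID string) (uint16, error) {
	if !strings.HasPrefix(nodeID, "ns=") {
		return 0, nil
	}
	idx, _, found := strings.Cut(strings.TrimPrefix(nodeID, "ns="), ";")
	if !found {
		return 0, fmt.Errorf("node ID %q has no identifier part", nodeID)
	}
	n, err := strconv.ParseUint(idx, 10, 16)
	if err != nil {
		return 0, fmt.Errorf("node ID %q has an invalid namespace index: %w", nodeID, err)
	}
	return uint16(n), nil
}

// userNodes keeps the node IDs whose namespace index is at least minNS.
// Node IDs that cannot be parsed are skipped.
func userNodes(nodeIDs []string, minNS uint16) []string {
	var out []string
	for _, id := range nodeIDs {
		ns, err := namespaceIndex(id)
		if err != nil {
			continue
		}
		if ns >= minNS {
			out = append(out, id)
		}
	}
	return out
}

func main() {
	// Hypothetical node IDs as they might come back from a browse.
	ids := []string{"i=2258", "ns=0;i=84", "ns=2;s=Machine.Temp", "ns=3;i=1001"}
	fmt.Println(userNodes(ids, 2))
}
```

Grouping the kept IDs by their namespace index would be one way to map namespaces to separate jobs or dashboard sections.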

shyamvalsan added needs triage feature request Use Cases labels Aug 31, 2022
