Stop using the KONI Ethernet interface #7

Closed
SalvoVirga opened this issue Mar 12, 2016 · 46 comments

@SalvoVirga
Member

We currently use the KONI Ethernet port of the cabinet for our communication.
That is a legacy of Khanris's work; we should rather use the Ethernet port used to sync with Sunrise Workbench.
This would let us integrate FRI commands using that port and make the whole setup of the stack much easier (no need to modify settings on the cabinet).

@lisikan

lisikan commented Mar 13, 2016

Thank you for answering. I reviewed your wiki and got your ideas. What I really care about is whether the SmartServo joint impedance control can set an additional torque. I want to develop a joint effort controller in ROS by just mirroring the current joint angles as commands and setting the damping and stiffness to zero. For the LBR 4+, there is a function doJntImpedanceControl(JntPosition, newJntStiff, JntDamp, JntAddTorque, false). I don't know whether the iiwa can do this.

@ahundt

ahundt commented Mar 14, 2016

Does ROS provide a way to bundle messages within a limited, specified range of ports? I think that will be necessary to use the Sunrise Ethernet port. Also, the Sunrise Ethernet port seems to have a delay of approximately 250 ms (eyeballed). I'm working around that by activating FRI and having a driver that sends a single message, packing all configuration, on one of the ports approved in the Java documentation, I believe 30010.

@Haas11

Haas11 commented Jul 19, 2016

somewhat related to this issue:

We tried dedicating ROS to a USB3 adapter on the cabinet to free up the KONI port for FRI.
This seemed to work, in that contact could be made with a Linux machine and all functionality in the iiwa stack was maintained. However, all communication is now capped at roughly 0.4 Hz.

Thinking we might have been too ambitious we decided to revert everything back to how it was before (using the KONI port for ROS). But.... now the frequency cap is still present, still at around 0.4 Hz...

Any thoughts on this?

@SalvoVirga
Member Author

Smart attempt! Although I am not sure the cabinet itself has a USB 3.0 controller, so that might explain the latency.
Why it sticks to that rate after reverting is a mystery to me too...
How are you computing the frequency? You can try to redo the whole configuration from scratch and see (move the OptionNIC to RTOS, reboot, move it to WIN, reboot, make a new project with iiwa_stack, install it, synchronise).

@Haas11

Haas11 commented Jul 20, 2016

We managed to fix the issue and the ROS communication via the USB3-Ethernet adapter seems now to be as fast as with the KONI port.

Our cabinet indeed has a USB 3.0 controller. The ports are the blue ones directly beneath the KONI port. However, KUKA had not installed the appropriate drivers. We had to install the Intel drivers “Intel USB 3.0 eXtensible Host Controller Driver” [and also the “Intel Chipset Device Software (INF Update Utility)” as well as the “Intel Management Engine Driver (NUC)” to get rid of all the yellow exclamation marks in the Windows device manager].

Fix of our issue:

What happened is that the installation of the USB3-Ethernet adapter somehow screwed up the configuration of the “Realtime OS Virtual Network Adapter”, which handles the communication between the RTOS/VxWin and the Windows Embedded 7. Resetting that to the configuration visible in the Windows GUI got it back up and running with the low frequency (settings below for reference).

However, that’s only half the trick. The remaining piece of the puzzle was solved here http://www.kuka-labs-forum.com/viewtopic.php?f=44&t=647 “TCP/IP communication in Windows is slow?” with a few additional entries in the registry for the “Realtime OS Virtual Network Adapter” (summary below for reference)

“Realtime OS Virtual Network Adapter” Settings:

Enabled options: Client for Microsoft Networks, File and Printer Sharing for Microsoft Networks, QoS Network Scheduler, Internet Protocol Version 4 (TCP/IPv4)

IPv4 Settings: Fixed IP 192.168.0.1, Subnet mask: 255.255.255.0, Default gateway: 192.168.0.2

Side note: if need be, you can uninstall this adapter in the device manager (with "delete drivers" NOT checked) and then add it again via “Add legacy hardware>Manually select hardware (advanced)>Network adapters>”, where you will see it listed.

Registry entries:

In “HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\services\Tcpip\Parameters\Interfaces” find the sub-entry that corresponds to the “Realtime OS Virtual Network Adapter”, then add the following 2 new values to that:

  • DWORD (32-bit), name “TcpAckFrequency”, value (hexadecimal) “1”
  • DWORD (32-bit), name “TCPNoDelay”, value (hexadecimal) “1”
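
For anyone who prefers a command line over the registry editor, the two values above could also be written with `reg add`. The Java sketch below only builds and prints the command strings; it does not execute anything. The class name, the method name and the `{INTERFACE-GUID}` placeholder are made up for illustration, and the real sub-key name of the “Realtime OS Virtual Network Adapter” has to be looked up on the cabinet.

```java
// Sketch only: prints the "reg add" commands for the two registry values
// listed above. Nothing is executed and nothing is written to the registry.
final class RegistryCommandSketch {

    // Parent key named in the comment above (HKLM = HKEY_LOCAL_MACHINE).
    static final String INTERFACES_KEY =
            "HKLM\\SYSTEM\\CurrentControlSet\\services\\Tcpip\\Parameters\\Interfaces";

    // Builds: reg add "<key>\<guid>" /v <name> /t REG_DWORD /d <value> /f
    static String regAddDword(String interfaceGuid, String valueName, int value) {
        return "reg add \"" + INTERFACES_KEY + "\\" + interfaceGuid + "\""
                + " /v " + valueName + " /t REG_DWORD /d " + value + " /f";
    }

    public static void main(String[] args) {
        // Hypothetical placeholder: replace with the sub-entry of the
        // "Realtime OS Virtual Network Adapter" found on your cabinet.
        String guid = "{INTERFACE-GUID}";
        System.out.println(regAddDword(guid, "TcpAckFrequency", 1));
        System.out.println(regAddDword(guid, "TCPNoDelay", 1));
    }
}
```

The printed commands would have to be run from an administrator prompt on the Windows side of the cabinet, and a reboot may be needed before the change takes effect.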

@SalvoVirga
Member Author

Interesting, and good that you managed to make it work. Sadly, the main reason behind the idea of no longer using the KONI port is to avoid the initial Ethernet controller setup, which is not documented by the manufacturer. This solution seems to require even more tinkering with the embedded system on board the robot controller, so it is probably not worth it.
Nevertheless, with this you are able to use FRI at the same time, which can be interesting.

@broesdecat

broesdecat commented Jul 7, 2017

Hi,

I tried using the X66 port for this (as I need FRI and ROS). The fact that the X66 is slow would not be a problem for us I think (otherwise, I will try out the USB solution mentioned above).

After some changes (finding out that the ROS_IP is found by looking through the network interface was the most difficult part :) ), everything looked configured correctly. RosMonitor gets to the ROS control loop, and I can see the rostopics created by sunrise on my external pc.
However, actual ROS messages do not seem to do anything. The external pc does not receive any messages sunrise is publishing, and commands given do not reach sunrise.

Any clues?

Update: I saw a similar issue mentioned in #39 related to ntp_with_host. That one is set to false for me.

Thx!
Broes

@ahundt

ahundt commented Jul 8, 2017

@broesdecat github.com/ahundt/grl works with FRI, ROS, and integrates with iiwa_stack, in case you need that.

@SalvoVirga
Member Author

@broesdecat It really depends on what you need, and the main point is: what do you need FRI for?
All that FRI allows you to do is read the robot joint positions and torques and command the robot in joint position (and I know very few people who managed to use the last feature).

iiwa_stack already allows that (and much more), but at a slower rate (for commands, capped by the SmartServo rate of 20 ms).

In depth, here is why what you tried doesn't work: you did a good job finding how we handle the network interfaces :P and, as you said, it should be correct. The only problem is that the X66 port is dedicated to the network that handles the sync of the Sunrise project. It is on the 172.31.1.xxx subnet by default and is some sort of NATed network. Doing what you tried, you should see something similar to:

Robot IP: fe80:0:0:0:0:5fef: and so on

That is the IP of the network interface that was found and, without the modification to the KONI port that iiwa_stack requires, it is the only standard network interface found by the code that you looked at. That means the 172.31.1.xxx network is somehow hidden. So the "classical" way of searching for the proper network interface doesn't work, and ROSJava tries to connect to that weird IPv6 address (which is the virtual interface between WIN and the RTOS).
So you may think: finding the network doesn't work and I get a wrong IP, so I can just manually put an IP in 172.31.1.xxx in the ROSJava configuration and it will use that! Yes, but no. I tried that, and ROSJava (most probably due to the old Netty version that is used there) doesn't find that network and cannot use it. That is why we still need to use the KONI port :(
Other frameworks like ZeroMQ are able to use that network, so I believe the problem is that Netty version (which is quite old). A refactor of ROSJava would be optimal but is definitely out of my scope.
Also, atm I don't see the benefit of using FRI to be honest, unless you REALLY need 1 kHz updates of the robot joint positions/torques.
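
For readers who want to reproduce the observation above, the "classical" interface search can be sketched with the standard `java.net` API. This is not the iiwa_stack code: the class and method names are made up for illustration, and the prefix check is a plain string comparison rather than a real netmask test. According to the explanation above, the 172.31.1.xxx addresses may simply not show up in this listing when it runs on the cabinet.

```java
import java.net.Inet4Address;
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;

// Sketch only: lists the IPv4 addresses the JVM can see and marks those in
// the default X66 subnet. Names are hypothetical, not taken from iiwa_stack.
final class InterfaceSearchSketch {

    // True if a dotted IPv4 string starts with the textual prefix, e.g. "172.31.1.".
    static boolean inSubnet(String ipv4, String prefix) {
        return ipv4.startsWith(prefix);
    }

    // Returns "interfaceName address" for every IPv4 address visible to the JVM.
    static List<String> visibleIpv4Addresses() throws SocketException {
        List<String> found = new ArrayList<String>();
        Enumeration<NetworkInterface> nics = NetworkInterface.getNetworkInterfaces();
        if (nics == null) {
            return found; // no interfaces reported at all
        }
        for (NetworkInterface nic : Collections.list(nics)) {
            for (InetAddress address : Collections.list(nic.getInetAddresses())) {
                if (address instanceof Inet4Address) {
                    found.add(nic.getName() + " " + address.getHostAddress());
                }
            }
        }
        return found;
    }

    public static void main(String[] args) throws SocketException {
        for (String entry : visibleIpv4Addresses()) {
            String ip = entry.substring(entry.indexOf(' ') + 1);
            System.out.println(entry + (inSubnet(ip, "172.31.1.") ? "  <- X66 subnet" : ""));
        }
    }
}
```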

The solution using USB/Ethernet adapter might actually be the easiest one!

About GRL: it works with FRI only for reading the robot joint positions and torques; for commands it still uses SmartServo motions, so the command rate is the same. It integrates with iiwa_stack in the sense that you can use the URDF model and the MoveIt! package (not with iiwa_hw tho). But anything else from iiwa_stack is not available, only reading of joint positions and torques and commanding joint positions with SmartServo (no Cartesian commands, setting of control modes, relative velocity, NTP, and so on). Please, @ahundt, feel free to correct me!

@broesdecat

Thanks for the suggestions @SalvoVirga and @ahundt, I guess the issue is indeed within the ROSJava part then. Yesterday, I still did a quick attempt to connect with a USB2LAN adapter and I needed no additional setup to get the Ethernet network itself up and running (but I did not yet get to actual ROS testing and communication speed).

Concerning the need for FRI: we are using a kinematic solver that commands velocities, not positions, to allow more complex movement actions, so realtime communication is really required. As we also want to be able to use some Sunrise APIs and some ROS APIs for some of the other actions, we also need ROS-Sunrise communication.

Thx!
Broes

@SalvoVirga
Member Author

Did you already try using the joint velocity messages in iiwa_stack? I never used that myself in a real scenario, but the colleague that added that used it without problems for some visual servoing tasks (at a fairly low speed tbh).

With FRI you might have real-time information on the joint positions, but control would still be limited by SmartServo (no joint velocity control with FRI and even joint control is a mystery there).

@ahundt

ahundt commented Jul 8, 2017

@SalvoVirga you can send commands over FRI as well. You'd have to approximate velocities with position commands, but at 1 kHz that's not too worrisome for most use cases.
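
A minimal sketch of what approximating velocities with position commands could look like: each cycle, the commanded joint position is advanced by velocity times cycle time. This is plain arithmetic, not the FRI client API; the class and method names are hypothetical, and the 1 ms cycle corresponds to the 1 kHz rate mentioned above.

```java
// Sketch only: turns a desired joint velocity into the next joint position
// setpoint by explicit Euler integration. No FRI or Sunrise calls are made.
final class VelocityToPositionSketch {

    static final double CYCLE_SECONDS = 0.001; // 1 kHz command rate

    // q: current commanded joint positions [rad]; qDot: desired velocities [rad/s].
    static double[] nextSetpoint(double[] q, double[] qDot, double dt) {
        if (q.length != qDot.length) {
            throw new IllegalArgumentException("position and velocity sizes differ");
        }
        double[] next = new double[q.length];
        for (int i = 0; i < q.length; i++) {
            next[i] = q[i] + qDot[i] * dt;
        }
        return next;
    }

    public static void main(String[] args) {
        double[] q = new double[7];                    // 7 joints, all at 0 rad
        double[] qDot = {0.5, 0, 0, 0, 0, 0, 0};       // move only the first joint
        for (int cycle = 0; cycle < 1000; cycle++) {   // one second of commands
            q = nextSetpoint(q, qDot, CYCLE_SECONDS);
        }
        System.out.println("joint 1 after 1 s: " + q[0] + " rad");
    }
}
```

In a real setup the setpoint would also have to respect joint limits and the controller's velocity and acceleration limits, which this sketch ignores.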

@SalvoVirga SalvoVirga added this to the iiwa_stack 1.3 milestone Jul 10, 2017
@SalvoVirga
Member Author

After exploring ROSJava more thoroughly, I finally found the trick that allows using the X66 connection ❗ 😀

No changes inside the cabinet required, no more KONI port, same behavior as before and, finally, the possibility to add FRI. Performance still has to be checked, but it seems nothing changed.

The current experimental version is in this branch, still to be polished.

@ahundt

ahundt commented Jul 10, 2017

@SalvoVirga were you planning to implement FRI on the Java side inside the cabinet or support it on a ROS node running on a computer outside the cabinet?

In the case of a computer outside the cabinet, you may want to consider using the KukaLBRiiwaROSPlugin class. I'm more than happy to help incorporate it directly into iiwa_stack if you are interested.

By the way @broesdecat what solver are you using? I'm using https://github.com/jrl-umi3218/Tasks

@ahundt

ahundt commented Jul 10, 2017

@SalvoVirga can you create a PR from your branch? That could make it easier to see & discuss the changes.

@SalvoVirga
Member Author

SalvoVirga commented Jul 13, 2017

@ahundt you can check the changes in 56f4e74 for the moment, after all the required changes were very easy, but I had to understand what ROSJava was doing in the background.
I will make a PR once I polish the code a bit and check things twice.
About FRI, to use the actual speed of FRI I guess the best solution is to get the info through the KUKA APIs on an external ROS node and vice versa for commands. So, yeah, something similar to what you did. But I will check all that later, it won't be part of the next iteration (1.3).

@broesdecat

@SalvoVirga Looks great! I will certainly check it out next week.
@ahundt We are using the Etasl solver, developed at KUL.

@SalvoVirga
Member Author

@broesdecat Please mind that for the moment the code requires the IP of the robot to be 172.31.1.147 and the linux machine 172.31.1.100. You can modify the instances of the first IP currently hardcoded in ROSBaseApplication and ROSSmartServo.java and in the config file for the linux machine IP. Let me know how it works for you :D

@ahundt

ahundt commented Jul 13, 2017

Oh you just had to configure a few addresses and ports, which RosJava supported, not too bad. One of those things where you wonder if only it had been more obvious a couple of years ago. :-)

@broesdecat

@SalvoVirga I did not check out ROS message passing yet, but I did run into a different issue, where I get interpolation errors on the FRI part halfway during a motion. It only goes wrong when running the FRIJointOverlay as part of a ROSBaseApplication, not when running it independently in a RoboticsAPIApplication. As interpolation errors typically point to some synchronization error, my current hunch is that the ROS threading/Ethernet messages are messing up the FRI/KONI communication. On the other hand, that might be completely wrong, and either way I have no idea on how to check/fix it :)

@ahundt

ahundt commented Jul 26, 2017

@broesdecat As a test for the cause, try increasing the number of milliseconds between FRI messages to the maximum.

@broesdecat

Apparently, the FRI interpolation errors could have been caused by the joints being close to their limits.
After some more testing:

  • ROS over X66: the subscribers on Sunrise do not receive any messages. Publishing works fine, and also more low-level messages (node and topic info etc.). I checked that the external pc actually sends the messages and they are also ACKed by the cabinet...
  • FRI+ROS: robot only gets to COMMANDING_WAIT FRI state, no longer moves at all (and I did not make any real changes, so not sure why it no longer moves now...). Identical code without ROS parts (so a standard RoboticsAPIApp) has a smooth FRI behavior.

@broesdecat

X66 update:
Apparently someone thought it a good idea to have ros_java do network resolution through hostnames, even for an application configured with IP addresses... Workaround is to edit /etc/hosts on all machines (also windows) to know the mapping IP-hostname. Proper solution is probably to have a DNS server running in my local static network, but trying to get that running will not be for now.
I found this by running the ROS part of my Sunrise app on a separate (normal) computer. You then get debug output errors related to unknown host exceptions in Java. Not sure whether that output is actually logged somewhere when running on Sunrise?
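
A quick way to check the hostname problem described above before starting the full application is to ask Java to resolve each machine's hostname. The sketch below uses only `java.net.InetAddress`; the hostnames `iiwa-cabinet` and `ros-pc` and the class name are hypothetical placeholders, while the two IPs are the ones mentioned earlier in this thread. For any name that does not resolve, it prints the line one would add to the hosts file.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Sketch only: checks whether hostnames resolve on this machine and prints
// the hosts-file entry to add when they do not. Hostnames are placeholders.
final class HostResolutionSketch {

    // True if the name (or IP literal) can be resolved on this machine.
    static boolean canResolve(String host) {
        try {
            InetAddress.getByName(host);
            return true;
        } catch (UnknownHostException e) {
            return false;
        }
    }

    // A hosts-file entry is the IP, whitespace, then the hostname.
    static String hostsLine(String ip, String hostname) {
        return ip + "\t" + hostname;
    }

    public static void main(String[] args) {
        String[][] machines = {
            {"172.31.1.147", "iiwa-cabinet"}, // hypothetical hostname of the cabinet
            {"172.31.1.100", "ros-pc"},       // hypothetical hostname of the Linux machine
        };
        for (String[] machine : machines) {
            if (!canResolve(machine[1])) {
                System.out.println("add to hosts file: " + hostsLine(machine[0], machine[1]));
            }
        }
    }
}
```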

@SalvoVirga
Member Author

SalvoVirga commented Jul 31, 2017

> ROS over X66: the subscribers on Sunrise do not receive any messages. Publishing works fine, and also more low-level messages (node and topic info etc.). I checked that the external pc actually sends the messages and they are also ACKed by the cabinet...

@broesdecat
Did you modify your ~/.bashrc so that it matches the new network IP? For me commands were working normally

@SalvoVirga
Member Author

SalvoVirga commented Jul 31, 2017

> FRI+ROS: robot only gets to COMMANDING_WAIT FRI state, no longer moves at all (and I did not make any real changes, so not sure why it no longer moves now...). Identical code without ROS parts (so a standard RoboticsAPIApp) has a smooth FRI behavior.

@broesdecat
Which ROS parts do you have in your application? Did you extend ROSBaseApplication? You could try to extend from a stripped down version of ROSBaseApplication, removing iiwaConfiguration and having a ROS node that just connects to a Master and does nothing (like iiwaPublisher but without all the publishers :D), or maybe just comment this line, which is basically all the application does.

If you feel like sharing the code I could give it a look.

@broesdecat

@SalvoVirga: I just checked and on Friday I rebased on top of your changes between 1.2.5 and the development branch. Apparently, FRI does not like one of those changes (reverting them gives me back a working FRI part). I will do some trial and error to find where it is going wrong.

@ahundt

ahundt commented Aug 1, 2017

woah, what's the trick to run locally?

@broesdecat

broesdecat commented Aug 1, 2017

@ahundt, do you mean how to run an app on a local computer? By commenting out all Sunrise code and calling initialize and run (the overrides from RoboticsAPIApplication) from a new main method ^^ If you want to do it properly, you can probably mock the Sunrise interfaces, but that is an order of magnitude more work :)

FRI not doing anything turns out to be because an asyncMove is started in ROSBaseApplication before the main control loop. As only one motion action is allowed at any given moment (and you don't get any notification about this), the FRI commands were just ignored. @SalvoVirga, can you comment on why you need that asyncMove in the first place?

@ahundt

ahundt commented Aug 1, 2017

aha I see, that makes sense, thanks!

@SalvoVirga
Member Author

> FRI not doing anything turns out to be because in ROSBaseApplication an asyncmove is started before the main control loop. As only one motion action is allowed at a given moment (and you don't get any notifications about this), the FRI commands were just ignored.

Indeed, so far we never used anything other than SmartServo motions, so that call wasn't doing any harm. I guess the main point was that the ROSMonitor app also has a SmartServo motion running to enable gravity compensation mode. ROSMonitor will disappear in iiwa_stack 1.3, so that call can just be moved to methods of ROSSmartServo without any harm.

@broesdecat
With that and the X66 problems solved, how is the FRI integration working right now? Any other major problems? Is the motion execution still troublesome?

@broesdecat

@SalvoVirga The only remaining issue I have is that startup regularly fails because of ports that are "in use" (jvm_bind errors), although I don't see any reason for them to be in use. It just happened after a reboot of the cabinet, and just trying to start the app a second time... Might be a rosjava issue.

@broesdecat

@SalvoVirga Nevermind. Apparently, which ports are available is configured on the cabinet (port by port...) and only 10 ports were made available while I had 6 nodes (so 12 ports). Everything is working properly now.
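
The arithmetic behind this is simple enough to check before deploying. The sketch below assumes two ports per node, which is inferred from "6 nodes (so 12 ports)" above; the class and method names are made up for illustration.

```java
// Sketch only: checks whether the ports opened on the cabinet are enough for
// a given number of ROS nodes. Two ports per node is an assumption taken from
// the "6 nodes (so 12 ports)" figure in this thread.
final class PortBudgetSketch {

    static final int PORTS_PER_NODE = 2;

    static int portsRequired(int nodes) {
        return nodes * PORTS_PER_NODE;
    }

    static boolean fits(int nodes, int openPorts) {
        return portsRequired(nodes) <= openPorts;
    }

    public static void main(String[] args) {
        int nodes = 6;      // nodes in the application
        int openPorts = 10; // ports made available on the cabinet
        System.out.println(nodes + " nodes need " + portsRequired(nodes)
                + " ports, " + openPorts + " open: "
                + (fits(nodes, openPorts) ? "ok" : "not enough"));
    }
}
```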

@ahundt

ahundt commented Sep 21, 2017

@broesdecat did you change your setup to use fewer ports? If you added additional ports how did you open additional ports?

@broesdecat

@ahundt, you can open up additional ports in the controller XML file in C:\KRC\Robot\Config\User\Common\KliConfig.xml

@bongomedia

It has been a long time since anything was written here, but is there any news about using the Sunrise synchronisation interface (X66)? I tested it and ran into the same problems described above: nothing is published over the network, but the topics are visible. It looks like what happens when you forget to set the ROS_IP parameter.

@Minimartian

Hi guys,

I am reusing this thread because we are running ROSSmartServo with the changes from the experimental branch to avoid using the KONI interface.
After some start/stop cycles of the ROSSmartServo application, I get an exception related to JVM_Bind saying that the address is already in use. I think it is something similar to what @broesdecat had. The error is probably caused by the setXmlRpcBindAddress() function, which, after some restarts, doesn't find any available port. Is anyone experiencing the same issue? Is there a way to fix this?

Many thanks,
Bruno

@exo-core
Contributor

Same problem here. I have the feeling that Sunrise does not properly remove an application object when you stop it, so that the addresses remain blocked... Restarting the Sunrise Cabinet will resolve the problem...

Alternatively you can generate the addresses dynamically. Create a class AddressGeneration with the following code

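package de.tum.in.camp.kuka.ros;

// Hands out a new port number on every call, starting at 30000.
public class AddressGeneration {
	// Static counter: the next port number to hand out.
	static int address = 30000;
	
	// Returns the current port number and advances the counter by one.
	// synchronized, so that two threads never receive the same number.
	public static synchronized int getNewAddress() {
		int newAddress = address;
		address++;
		return newAddress;
	}
}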

and exchange all hard-coded addresses for dynamic ones, e.g.

nodeConfConfiguration.setTcpRosBindAddress(BindAddress.newPublic(AddressGeneration.getNewAddress()));

@SalvoVirga
Member Author

Yes, restarting the cabinet solves the problem and I also think it's an issue of Sunrise not disposing resources correctly...

@exo-core does your solution completely solve the issue? That would be very cool, although I don't see how it would differ from the current implementation. In the end, the same port numbers would be used.

@exo-core
Contributor

exo-core commented Dec 10, 2018

The solution is more like a hacky workaround. As the counter is static, you get fresh addresses once you restart the application. Thus you don't run into this issue over and over again, but the old ones remain 'used'.

Far from perfect, but it saves you a lot of rebooting while developing^^

@SalvoVirga
Member Author

I see, I see, I missed the static keyword. The only problem then is that the number of open ports is quite limited, so one risks assigning unavailable ones at some point.

Let's see if we can come up with something better before merging dev to master.
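
One possible direction, as a sketch only: instead of counting upwards without limit, probe a fixed range of ports and take the first one that can actually be bound. The probe uses a plain `java.net.ServerSocket`; the class name and the upper bound 30010 are assumptions for illustration, and the range has to match the ports opened on the cabinet. This does not make Sunrise release ports it still holds, it only avoids handing out ports that are still bound or outside the range.

```java
import java.io.IOException;
import java.net.ServerSocket;

// Sketch only: picks the first port in a fixed range that can be bound right
// now. Not part of iiwa_stack; the range used in main is an assumption.
final class BoundedPortSketch {

    // True if a server socket can be bound to the port at this moment.
    static boolean isFree(int port) {
        ServerSocket probe = null;
        try {
            probe = new ServerSocket(port);
            return true;
        } catch (IOException e) {
            return false; // typically "address already in use"
        } finally {
            if (probe != null) {
                try {
                    probe.close();
                } catch (IOException ignored) {
                    // nothing useful to do if closing the probe fails
                }
            }
        }
    }

    // Returns the first free port in [from, to], or -1 if none is free.
    static int firstFree(int from, int to) {
        for (int port = from; port <= to; port++) {
            if (isFree(port)) {
                return port;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int port = firstFree(30000, 30010); // hypothetical range of opened ports
        System.out.println(port < 0 ? "no free port in range" : "first free port: " + port);
    }
}
```

There is a small window between the probe and the later bind by ROSJava in which another process could take the port, so this reduces the problem rather than eliminating it.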

@exo-core
Contributor

Is there a call to force used addresses to be freed? As we have a counter of used addresses that is still available after a restart, one could easily implement a reset() method that iterates over them and unbinds them.

@SalvoVirga
Member Author

I don't think that is possible from our side :|
It would be good to have a toy example (not using ROSJava) to check if it's indeed Sunrise not releasing the resources. If that is the case I would annoy the guys at KUKA 😈

@SalvoVirga
Member Author

Also, @exo-core and @Minimartian which version of Sunrise are you currently using?
We have the problem quite often using Sunrise 1.11, but on one robot with Sunrise 1.15 I tried to restart ROSSmartServo ~30 times and never had it.

@exo-core
Contributor

1.13 here.

@ghost

ghost commented Dec 11, 2018

We are using the medical version, Sunrise.OS Med 1.0, which I think is based on the 1.10

@SalvoVirga SalvoVirga removed this from the iiwa_stack 1.3 milestone May 9, 2019
@SalvoVirga
Member Author

Finally fixed in #178 !


8 participants