Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Standalone mavcesium uses a lot of resource #34

Closed
fnoop opened this issue Nov 30, 2017 · 22 comments · Fixed by #35
Closed

Standalone mavcesium uses a lot of resource #34

fnoop opened this issue Nov 30, 2017 · 22 comments · Fixed by #35
Assignees
Labels

Comments

@fnoop
Copy link
Member

fnoop commented Nov 30, 2017

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
14994 mav       20   0  427996 112960  12180 R  20.1  6.6  26:34.26 python

112Mb resident memory is a lot for a small embedded computer (for example an rpi zero only has 384mb after graphics). Can this be reduced?

The cpu is presumably down to pymavlink?

@SamuelDudley
Copy link
Member

I have yet to profile anything, but I'm sure we can save some of those resources :)
As a first step you could try increasing the blocking wait time in the select call to minimize any CPU spinning that may be occurring. Try increasing 0.01 to 1.0 and see if it reduces the overall CPU load a bit
https://github.com/SamuelDudley/MAVCesium/blob/master/app/cesium_web_server.py#L172
change to
inputready,outputready,exceptready = select.select([self.connection.control_connection.port],[],[],1.0)

Unfortunately (as you know) pyMAVLink is a bit of a CPU hog. But knowing where the memory and CPU is being used is a good start!

@fnoop
Copy link
Member Author

fnoop commented Dec 1, 2017

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
14994 mav       20   0 1041352 723284   5112 S  20.6 42.4 275:22.41 python

Ouch, that's 724Mb resident after it's been running for a while. Restarting it goes back to low memory usage (40mb) but climbs slowly. Must either be a memory leak, or something is creating resources and releasing previous ones.

@fnoop
Copy link
Member Author

fnoop commented Dec 1, 2017

Changing the wait time as suggested doesn't seem to make any difference.

@SamuelDudley
Copy link
Member

Nasty! Thanks for reporting this. I'm sure I'll be able to track something down...

@SamuelDudley
Copy link
Member

I haven't had a chance to profile the code yet, but looking over it there were a couple of places that needed to be fixed. One being a infinite sized queue with no consumer! =S
The WIP branch with the fixes is here https://github.com/SamuelDudley/MAVCesium/tree/memory_fix
#35

@SamuelDudley
Copy link
Member

Spent pretty much all day on this with no real luck....

Some things to note:

  • Main CPU usage is receiving MAVLink messages
  • PyMAVLink does use a fair slab of memory but remains consistent once all message types have been received (each connection stores the latest instance of each message type).
  • The real memory issues seem to be with tornado. I'm using the latest from pip tornado-4.5.2

Using guppy to try and profile the memory usage:

launching with no websocket connections:

Partition of a set of 843 objects. Total size = 161264 bytes.
 Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
     0     31   4    32488  20     32488  20 dict of
                                             pymavlink.dialects.v20.ardupilotmega.MAVLink_header
     1     28   3    11680   7     44168  27 dict (no owner)
     2    111  13     9768   6     53936  33 __builtin__.weakref
     3     62   7     5512   3     59448  37 array.array
     4      7   1     3408   2     62856  39 types.FrameType
     5      1   0     3352   2     66208  41 dict of pymavlink.dialects.v20.ardupilotmega.MAVLink
     6      1   0     3352   2     69560  43 dict of
                                             pymavlink.dialects.v20.ardupilotmega.MAVLink_camera_fee
                                             dback_message
     7      1   0     3352   2     72912  45 dict of
                                             pymavlink.dialects.v20.ardupilotmega.MAVLink_rc_channel
                                             s_message
     8      1   0     3352   2     76264  47 dict of
                                             pymavlink.dialects.v20.ardupilotmega.MAVLink_sensor_off
                                             sets_message
     9      1   0     3352   2     79616  49 dict of
                                             pymavlink.dialects.v20.ardupilotmega.MAVLink_servo_outp
                                             ut_raw_message
<127 more rows. Type e.g. '_.more' to view.>

Pretty stable around 161264 bytes used...

Adding a connection:

websocket opened!
Partition of a set of 2160 objects. Total size = 433728 bytes.
 Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
     0    281  13    55440  13     55440  13 str
     1     33   2    34584   8     90024  21 dict of tornado.concurrent.Future
     2     31   1    32488   7    122512  28 dict of
                                             pymavlink.dialects.v20.ardupilotmega.MAVLink_header
     3      7   0    23464   5    145976  34 dict of tornado.iostream.IOStream
     4     49   2    21400   5    167376  39 dict (no owner)
     5    121   6    15112   3    182488  42 list
     6     13   1    13624   3    196112  45 dict of tornado.gen.Runner
     7    179   8    13288   3    209400  48 tuple
     8    149   7    13112   3    222512  51 __builtin__.weakref
     9    100   5    12000   3    234512  54 function
<179 more rows. Type e.g. '_.more' to view.>

Partition of a set of 2182 objects. Total size = 435728 bytes.
 Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
     0    281  13    55440  13     55440  13 str
     1     33   2    34584   8     90024  21 dict of tornado.concurrent.Future
     2     31   1    32488   7    122512  28 dict of
                                             pymavlink.dialects.v20.ardupilotmega.MAVLink_header
     3      7   0    23464   5    145976  34 dict of tornado.iostream.IOStream
     4     49   2    21400   5    167376  38 dict (no owner)
     5    172   8    15136   3    182512  42 __builtin__.weakref
     6    121   6    15112   3    197624  45 list
     7     13   1    13624   3    211248  48 dict of tornado.gen.Runner
     8    179   8    13288   3    224536  52 tuple
     9    100   5    12000   3    236536  54 function
<179 more rows. Type e.g. '_.more' to view.>

Memory usage instantly increased to 435728 bytes...
Hitting f5 to reload the page results in a closed and reopened connection:

websocket closed
websocket opened!
Partition of a set of 3577 objects. Total size = 662280 bytes.
 Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
     0     55   2   122920  19    122920  19 dict (no owner)
     1   1164  33   103368  16    226288  34 str
     2    548  15    73312  11    299600  45 list
     3     33   1    34584   5    334184  50 dict of tornado.concurrent.Future
     4     31   1    32488   5    366672  55 dict of
                                             pymavlink.dialects.v20.ardupilotmega.MAVLink_header
     5     25   1    26200   4    392872  59 dict of tornado.gen.Runner
     6      7   0    23464   4    416336  63 dict of tornado.iostream.IOStream
     7    189   5    16632   3    432968  65 __builtin__.weakref
     8    187   5    13856   2    446824  67 tuple
     9    102   3    12240   2    459064  69 function
<181 more rows. Type e.g. '_.more' to view.>

Partition of a set of 3528 objects. Total size = 646952 bytes.
 Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
     0     55   2   122920  19    122920  19 dict (no owner)
     1   1164  33   103368  16    226288  35 str
     2    548  16    73312  11    299600  46 list
     3     33   1    34584   5    334184  52 dict of tornado.concurrent.Future
     4     31   1    32488   5    366672  57 dict of
                                             pymavlink.dialects.v20.ardupilotmega.MAVLink_header
     5      7   0    23464   4    390136  60 dict of tornado.iostream.IOStream
     6    177   5    15576   2    405712  63 __builtin__.weakref
     7    187   5    13856   2    419568  65 tuple
     8     13   0    13624   2    433192  67 dict of tornado.gen.Runner
     9    102   3    12240   2    445432  69 function
<181 more rows. Type e.g. '_.more' to view.> 

Memory usage has increased again while serving no more connections...
I then opened and closed a few more connections and the new memory usage while not serving any clients is 685680 bytes:

Partition of a set of 4917 objects. Total size = 685680 bytes.
 Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
     0   3299  67   286120  42    286120  42 str
     1     39   1   115752  17    401872  59 dict (no owner)
     2    471  10    84280  12    486152  71 list
     3     31   1    32488   5    518640  76 dict of
                                             pymavlink.dialects.v20.ardupilotmega.MAVLink_header
     4    189   4    16632   2    535272  78 __builtin__.weakref
     5     27   1     7560   1    542832  79 dict of tornado.template._Text
     6     31   1     7192   1    550024  80 __builtin__.set
     7     83   2     6424   1    556448  81 tuple
     8     20   0     5600   1    562048  82 dict of tornado.template._Expression
     9     62   1     5512   1    567560  83 array.array
<149 more rows. Type e.g. '_.more' to view.>

So clearly something within the tornado server is not cleaning up after itself...
Possibly related https://stackoverflow.com/questions/47486897/tornado-websocket-server-doesnt-release-ram-memory

@SamuelDudley
Copy link
Member

with regards to CPU usage we can see that most of the time spent by pyMAVLink is parsing the incoming char buffer

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
   314                                               def recv_msg_wrapper(self):
   315                                           #         lp_wrapper = lp(self.process_connection_in)
   316                                                   '''message receive routine'''
   317       729         3252      4.5      0.8          self.pre_message()
   318      1458         1683      1.2      0.4          while True:
   319      1458        13895      9.5      3.5              n = self.mav.bytes_needed()
   320      1458        11888      8.2      3.0              s = self.recv(n)
   321      1458         1698      1.2      0.4              numnew = len(s)
   322                                           
   323      1458         1295      0.9      0.3              if numnew != 0:
   324      1458        10525      7.2      2.7                  if self.logfile_raw:
   325                                                               self.logfile_raw.write(str(s))
   326      1458         1397      1.0      0.4                  if self.first_byte:
   327         1           10     10.0      0.0                      self.auto_mavlink_version(s)
   328                                           
   329                                                       # We always call parse_char even if the new string is empty, because the existing message buf might already have some valid packet
   330                                                       # we can extract
   331      1458       283133    194.2     71.6              msg = self.mav.parse_char(s)
   332      1458         6292      4.3      1.6              if msg:
   333       729          800      1.1      0.2                  if self.logfile and  msg.get_type() != 'BAD_DATA' :
   334                                                               usec = int(time.time() * 1.0e6) & ~3
   335                                                               self.logfile.write(str(struct.pack('>Q', usec) + msg.get_msgbuf()))
   336       729        58211     79.9     14.7                  self.post_message(msg)
   337       729          877      1.2      0.2                  return msg
   338                                                       else:
   339                                                           # if we failed to parse any messages _and_ no new bytes arrived, return immediately so the client has the option to
   340                                                           # timeout
   341       729          677      0.9      0.2                  if numnew == 0:
   342                                                               return None

self.mav.parse_char(s) is within the generated dialect code with options for speedup via mavnative but note that mavnative isn't supported for MAVLink2 yet...

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
 10485                                                   def parse_char_wrapper(self, c):
 10486                                                       '''input some data bytes, possibly returning a new message'''
 10487       429         1423      3.3      1.8              self.buf.extend(c)
 10488                                           
 10489       429          856      2.0      1.1              self.total_bytes_received += len(c)
 10490                                           
 10491       429          370      0.9      0.5              if self.native:
 10492                                                           if native_testing:
 10493                                                               self.test_buf.extend(c)
 10494                                                               m = self.__parse_char_native(self.test_buf)
 10495                                                               m2 = self.__parse_char_legacy()
 10496                                                               if m2 != m:
 10497                                                                   print("Native: %s\nLegacy: %s\n" % (m, m2))
 10498                                                                   raise Exception('Native vs. Legacy mismatch')
 10499                                                           else:
 10500                                                               m = self.__parse_char_native(self.buf)
 10501                                                       else:
 10502       429        73533    171.4     91.8                  m = self.__parse_char_legacy()
 10503                                           
 10504       429         1724      4.0      2.2              if m != None:
 10505       214          287      1.3      0.4                  self.total_packets_received += 1
 10506       214         1288      6.0      1.6                  self.__callbacks(m)
 10507                                                       else:
 10508                                                           # XXX The idea here is if we've read something and there's nothing left in
 10509                                                           # the buffer, reset it to 0 which frees the memory
 10510       215          334      1.6      0.4                  if self.buf_len() == 0 and self.buf_index != 0:
 10511                                                               self.buf = bytearray()
 10512                                                               self.buf_index = 0
 10513                                           
 10514       429          270      0.6      0.3              return m

self.__parse_char_legacy() is called in this case as I'm using mavlink v2 (mavnative can't help us!)

Function: __parse_char_legacy_wrapper at line 10521

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
 10521                                                   def __parse_char_legacy_wrapper(self):
 10522                                                       '''input some data bytes, possibly returning a new message (uses no native code)'''
 10523       467          897      1.9      1.2              header_len = HEADER_LEN_V1
 10524       467         1759      3.8      2.3              if self.buf_len() >= 1 and self.buf[self.buf_index] == PROTOCOL_MARKER_V2:
 10525       467          524      1.1      0.7                  header_len = HEADER_LEN_V2
 10526                                                           
 10527       467         1116      2.4      1.4              if self.buf_len() >= 1 and self.buf[self.buf_index] != PROTOCOL_MARKER_V1 and self.buf[self.buf_index] != PROTOCOL_MARKER_V2:
 10528                                                           magic = self.buf[self.buf_index]
 10529                                                           self.buf_index += 1
 10530                                                           if self.robust_parsing:
 10531                                                               m = MAVLink_bad_data(chr(magic), 'Bad prefix')
 10532                                                               self.expected_length = header_len+2
 10533                                                               self.total_receive_errors += 1
 10534                                                               return m
 10535                                                           if self.have_prefix_error:
 10536                                                               return None
 10537                                                           self.have_prefix_error = True
 10538                                                           self.total_receive_errors += 1
 10539                                                           raise MAVError("invalid MAVLink prefix '%s'" % magic)
 10540       467          629      1.3      0.8              self.have_prefix_error = False
 10541       467          833      1.8      1.1              if self.buf_len() >= 3:
 10542       467         1284      2.7      1.6                  sbuf = self.buf[self.buf_index:3+self.buf_index]
 10543       467         1053      2.3      1.4                  if sys.version_info[0] < 3:
 10544       467         1228      2.6      1.6                      sbuf = str(sbuf)
 10545       467         1711      3.7      2.2                  (magic, self.expected_length, incompat_flags) = struct.unpack('BBB', sbuf)
 10546       467          777      1.7      1.0                  if magic == PROTOCOL_MARKER_V2 and (incompat_flags & MAVLINK_IFLAG_SIGNED):
 10547                                                                   self.expected_length += MAVLINK_SIGNATURE_BLOCK_LEN
 10548       467          667      1.4      0.9                  self.expected_length += header_len + 2
 10549       467         1031      2.2      1.3              if self.expected_length >= (header_len+2) and self.buf_len() >= self.expected_length:
 10550       233         2757     11.8      3.5                  mbuf = array.array('B', self.buf[self.buf_index:self.buf_index+self.expected_length])
 10551       233          338      1.5      0.4                  self.buf_index += self.expected_length
 10552       233          251      1.1      0.3                  self.expected_length = header_len+2
 10553       233          269      1.2      0.3                  if self.robust_parsing:
 10554       233          205      0.9      0.3                      try:
 10555       233          386      1.7      0.5                          if magic == PROTOCOL_MARKER_V2 and (incompat_flags & ~MAVLINK_IFLAG_SIGNED) != 0:
 10556                                                                       raise MAVError('invalid incompat_flags 0x%x 0x%x %u' % (incompat_flags, magic, self.expected_length))
 10557       233        59685    256.2     76.6                          m = self.decode(mbuf)
 10558                                                               except MAVError as reason:
 10559                                                                   m = MAVLink_bad_data(mbuf, reason.message)
 10560                                                                   self.total_receive_errors += 1
 10561                                                           else:
 10562                                                               if magic == PROTOCOL_MARKER_V2 and (incompat_flags & ~MAVLINK_IFLAG_SIGNED) != 0:
 10563                                                                   raise MAVError('invalid incompat_flags 0x%x 0x%x %u' % (incompat_flags, magic, self.expected_length))
 10564                                                               m = self.decode(mbuf)
 10565       233          299      1.3      0.4                  return m
 10566       234          216      0.9      0.3              return None

and finally we see most of the time is spent decoding the message buffer into (maybe) a message within self.decode(mbuf)

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
 10625                                                   def decode_wrapper(self, msgbuf):
 10626                                                           '''decode a buffer as a MAVLink message'''
 10627                                                           # decode the header
 10628       136          346      2.5      0.5                  if msgbuf[0] != PROTOCOL_MARKER_V1:
 10629       136          210      1.5      0.3                      headerlen = 10
 10630       136          187      1.4      0.3                      try:
 10631       136          861      6.3      1.2                          magic, mlen, incompat_flags, compat_flags, seq, srcSystem, srcComponent, msgIdlow, msgIdhigh = struct.unpack('<cBBBBBBHB', msgbuf[:headerlen])
 10632                                                               except struct.error as emsg:
 10633                                                                   raise MAVError('Unable to unpack MAVLink header: %s' % emsg)
 10634       136          326      2.4      0.5                      msgId = msgIdlow | (msgIdhigh<<16)
 10635       136          205      1.5      0.3                      mapkey = msgId
 10636                                                           else:
 10637                                                               headerlen = 6
 10638                                                               try:
 10639                                                                   magic, mlen, seq, srcSystem, srcComponent, msgId = struct.unpack('<cBBBBB', msgbuf[:headerlen])
 10640                                                                   incompat_flags = 0
 10641                                                                   compat_flags = 0
 10642                                                               except struct.error as emsg:
 10643                                                                   raise MAVError('Unable to unpack MAVLink header: %s' % emsg)
 10644                                                               mapkey = msgId
 10645       136          225      1.7      0.3                  if (incompat_flags & MAVLINK_IFLAG_SIGNED) != 0:
 10646                                                               signature_len = MAVLINK_SIGNATURE_BLOCK_LEN
 10647                                                           else:
 10648       136          200      1.5      0.3                      signature_len = 0
 10649                                           
 10650       136          274      2.0      0.4                  if ord(magic) != PROTOCOL_MARKER_V1 and ord(magic) != PROTOCOL_MARKER_V2:
 10651                                                               raise MAVError("invalid MAVLink prefix '%s'" % magic)
 10652       136          294      2.2      0.4                  if mlen != len(msgbuf)-(headerlen+2+signature_len):
 10653                                                               raise MAVError('invalid MAVLink message length. Got %u expected %u, msgId=%u headerlen=%u' % (len(msgbuf)-(headerlen+2+signature_len), mlen, msgId, headerlen))
 10654                                           
 10655       136          293      2.2      0.4                  if not mapkey in mavlink_map:
 10656                                                               raise MAVError('unknown MAVLink message ID %s' % str(mapkey))
 10657                                           
 10658                                                           # decode the payload
 10659       136          237      1.7      0.3                  type = mavlink_map[mapkey]
 10660       136          291      2.1      0.4                  fmt = type.format
 10661       136          259      1.9      0.4                  order_map = type.orders
 10662       136          247      1.8      0.4                  len_map = type.lengths
 10663       136          231      1.7      0.3                  crc_extra = type.crc_extra
 10664                                           
 10665                                                           # decode the checksum
 10666       136          197      1.4      0.3                  try:
 10667       136          633      4.7      0.9                      crc, = struct.unpack('<H', msgbuf[-(2+signature_len):][:2])
 10668                                                           except struct.error as emsg:
 10669                                                               raise MAVError('Unable to unpack MAVLink CRC: %s' % emsg)
 10670       136          272      2.0      0.4                  crcbuf = msgbuf[1:-(2+signature_len)]
 10671       136          235      1.7      0.3                  if True: # using CRC extra
 10672       136          368      2.7      0.5                      crcbuf.append(crc_extra)
 10673       136         7309     53.7     10.5                  crc2 = x25crc(crcbuf)
 10674       136          248      1.8      0.4                  if crc != crc2.crc:
 10675                                                               raise MAVError('invalid MAVLink CRC in msgID %u 0x%04x should be 0x%04x' % (msgId, crc, crc2.crc))
 10676                                           
 10677       136          219      1.6      0.3                  sig_ok = False
 10678       136          223      1.6      0.3                  if signature_len == MAVLINK_SIGNATURE_BLOCK_LEN:
 10679                                                               self.signing.sig_count += 1
 10680       136          344      2.5      0.5                  if self.signing.secret_key is not None:
 10681                                                               accept_signature = False
 10682                                                               if signature_len == MAVLINK_SIGNATURE_BLOCK_LEN:
 10683                                                                   sig_ok = self.check_signature(msgbuf, srcSystem, srcComponent)
 10684                                                                   accept_signature = sig_ok
 10685                                                                   if sig_ok:
 10686                                                                       self.signing.goodsig_count += 1
 10687                                                                   else:
 10688                                                                       self.signing.badsig_count += 1
 10689                                                                   if not accept_signature and self.signing.allow_unsigned_callback is not None:
 10690                                                                       accept_signature = self.signing.allow_unsigned_callback(self, msgId)
 10691                                                                       if accept_signature:
 10692                                                                           self.signing.unsigned_count += 1
 10693                                                                       else:
 10694                                                                           self.signing.reject_count += 1
 10695                                                               elif self.signing.allow_unsigned_callback is not None:
 10696                                                                   accept_signature = self.signing.allow_unsigned_callback(self, msgId)
 10697                                                                   if accept_signature:
 10698                                                                       self.signing.unsigned_count += 1
 10699                                                                   else:
 10700                                                                       self.signing.reject_count += 1
 10701                                                               if not accept_signature:
 10702                                                                   raise MAVError('Invalid signature')
 10703                                           
 10704       136          374      2.8      0.5                  csize = struct.calcsize(fmt)
 10705       136          307      2.3      0.4                  mbuf = msgbuf[headerlen:-(2+signature_len)]
 10706       136          262      1.9      0.4                  if len(mbuf) < csize:
 10707                                                               # zero pad to give right size
 10708        59          343      5.8      0.5                      mbuf.extend([0]*(csize - len(mbuf)))
 10709       136          247      1.8      0.4                  if len(mbuf) < csize:
 10710                                                               raise MAVError('Bad message of type %s length %u needs %s' % (
 10711                                                                   type, len(mbuf), csize))
 10712       136          293      2.2      0.4                  mbuf = mbuf[:csize]
 10713       136          220      1.6      0.3                  try:
 10714       136          495      3.6      0.7                      t = struct.unpack(fmt, mbuf)
 10715                                                           except struct.error as emsg:
 10716                                                               raise MAVError('Unable to unpack MAVLink payload type=%s fmt=%s payloadLength=%u: %s' % (
 10717                                                                   type, fmt, len(mbuf), emsg))
 10718                                           
 10719       136          437      3.2      0.6                  tlist = list(t)
 10720                                                           # handle sorted fields
 10721       136          221      1.6      0.3                  if True:
 10722       136          316      2.3      0.5                      t = tlist[:]
 10723       136          412      3.0      0.6                      if sum(len_map) == len(len_map):
 10724                                                                   # message has no arrays in it
 10725      1086         7125      6.6     10.2                          for i in range(0, len(tlist)):
 10726       955         1716      1.8      2.5                              tlist[i] = t[order_map[i]]
 10727                                                               else:
 10728                                                                   # message has some arrays
 10729         5            9      1.8      0.0                          tlist = []
 10730        50          255      5.1      0.4                          for i in range(0, len(order_map)):
 10731        45           70      1.6      0.1                              order = order_map[i]
 10732        45           66      1.5      0.1                              L = len_map[order]
 10733        45          103      2.3      0.1                              tip = sum(len_map[:order])
 10734        45           64      1.4      0.1                              field = t[tip]
 10735        45           71      1.6      0.1                              if L == 1 or isinstance(field, str):
 10736        40           69      1.7      0.1                                  tlist.append(field)
 10737                                                                       else:
 10738         5           13      2.6      0.0                                  tlist.append(t[tip:(tip + L)])
 10739                                           
 10740                                                           # terminate any strings
 10741      1136         4935      4.3      7.1                  for i in range(0, len(tlist)):
 10742      1000        30009     30.0     43.2                      if isinstance(tlist[i], str):
 10743                                                                   tlist[i] = str(MAVString(tlist[i]))
 10744       136          397      2.9      0.6                  t = tuple(tlist)
 10745                                                           # construct the message object
 10746       136          221      1.6      0.3                  try:
 10747       136         3277     24.1      4.7                      m = type(*t)
 10748                                                           except Exception as emsg:
 10749                                                               raise MAVError('Unable to instantiate MAVLink message of type %s : %s' % (type, emsg))
 10750       136          261      1.9      0.4                  m._signed = sig_ok
 10751       136          233      1.7      0.3                  if m._signed:
 10752                                                               m._link_id = msgbuf[-13]
 10753       136          237      1.7      0.3                  m._msgbuf = msgbuf
 10754       136          357      2.6      0.5                  m._payload = msgbuf[6:-(2+signature_len)]
 10755       136          240      1.8      0.3                  m._crc = crc
 10756       136          928      6.8      1.3                  m._header = MAVLink_header(msgId, incompat_flags, compat_flags, mlen, seq, srcSystem, srcComponent)
 10757       136          196      1.4      0.3                  return m

There you have it... So if you have ever wondered where pyMAVLink spends most of its CPU time, wonder no more!

@fnoop
Copy link
Member Author

fnoop commented Dec 5, 2017

re: cpu that's a really good target for improvement, for pymavlink! Worth posting as an issue there?
Agree as suspected it has nothing to do with mavcesium

@fnoop
Copy link
Member Author

fnoop commented Dec 5, 2017

So here is mavcesium memory profiled with no clients attached:
screen shot 2017-12-05 at 07 32 34
The interesting thing to note is the stepped increase in program data memory (and therefore VM memory) where the step time increases over time, almost like a backoff algorithm. Restarting it shows the same thing start again:
screen shot 2017-12-05 at 09 44 21
Now profiling again with the memory_fix branch.

@SamuelDudley
Copy link
Member

At the moment there aren't any changes that I would expect to make a difference in the memory_fix branch... Only a few small changes but nothing to fix what you are experiencing... It's interesting to note the memory climb without any clients connected, in my testing I didn't see anything nearly as pronounced as the graph you have.
What time scale is that graph over minutes or hours?

@SamuelDudley
Copy link
Member

My first thought was that the mavlink connection was keeping some sort of memory based log of received messages, causing the increase usage over time. I did go digging and didn't find any suspicious code... Not to say the issue isn't with pymavlink. I just didn't find it while looking for it.

@fnoop
Copy link
Member Author

fnoop commented Dec 5, 2017

Timescale is along the bottom, first graph is about 8 hours, second is over 6 hours. Interestingly since restarting again from the memory_fix branch it uses less CPU:
screen shot 2017-12-05 at 11 13 08

I'll do a couple more tests - one without the nginx reverse proxy and another one against a very simple dronekit app, to see if the problem exists with just pymavlink.

@SamuelDudley
Copy link
Member

The majority of the CPU usage reduction comes with the reduction in message stream rate... less messages to parse means less CPU used. The issue is that the display becomes a bit 'blocky' with updates. We can tune the message rate to help balance the CPU usage if that would help (or make the rate a launch option?)

@fnoop
Copy link
Member Author

fnoop commented Dec 5, 2017

Ah OK. Memory usage of mavlink-params, which is a very simple dronekit app to manage parameters, has a completely static memory usage. So the problem isn't inherently present in pymavlink.

@SamuelDudley
Copy link
Member

Thanks for testing. I'm pretty sure it's related to tornado but just need to pin down where...

@SamuelDudley
Copy link
Member

SamuelDudley commented Dec 6, 2017

Right... so I'm pretty sure its in pyMAVLink. By commenting out self.connection.control_connection.recv_msg() and replacing it with time.sleep(0.01) the memory leak is gone. Additionally if I put that bit of code back in and don't do anything with the received message the memory leak comes back.
So somewhere in recv_msg() something is growing over time...

@SamuelDudley
Copy link
Member

@fnoop pretty sure I have tracked down the source of the memory leak and plugged it. Could you please test the latest commit in #35 when you have a chance...
Thanks !

@fnoop
Copy link
Member Author

fnoop commented Dec 6, 2017

Oh well done!
screen shot 2017-12-06 at 13 02 39

@fnoop fnoop closed this as completed Dec 6, 2017
@fnoop
Copy link
Member Author

fnoop commented Dec 6, 2017

Is that a bug in pymavlink, or a feature?

@SamuelDudley
Copy link
Member

SamuelDudley commented Dec 6, 2017

I'm going to call it a bug... The behavior we were seeing is the result of a bytearray being extended over and over again.
In the dialect generated code e.g. ardupilotmega.py there is a MAVLink object that stores the connection buffer / bytearray in self.buf and an associated index in self.buf_index.
In the parse_char() method of this object the problem occurs (note I have removed some unused mavnative logic below for clarity):

        def parse_char(self, c):
            '''input some data bytes, possibly returning a new message'''
            self.buf.extend(c)
            self.total_bytes_received += len(c)
            m = self.__parse_char_legacy()
            if m != None:
                self.total_packets_received += 1
                self.__callbacks(m)
            else:
                if self.buf_len() == 0 and self.buf_index != 0:
                    self.buf = bytearray()
                    self.buf_index = 0
            return m

The issue is that every time this code is run there is always a complete message waiting to be decoded. Because of this the final else statement never gets ran and the buffer continues to grow.
Although the logic in that statement self.buf_len() == 0 and self.buf_index != 0 is true every time this method is ran, self.buf and self.buf_index are never able to be reset... personally I think the else could be removed and we should be testing that condition every time that method is called. e.g.:

        def parse_char(self, c):
            '''input some data bytes, possibly returning a new message'''
            self.buf.extend(c)
            self.total_bytes_received += len(c)
            m = self.__parse_char_legacy()
            if m != None:
                self.total_packets_received += 1
                self.__callbacks(m)
            if self.buf_len() == 0 and self.buf_index != 0:
                self.buf = bytearray()
                self.buf_index = 0
            return m

which equivalent to what is now in the MAVCesium server code.

@SamuelDudley
Copy link
Member

pyMAVLink issue raised: ArduPilot/pymavlink#130

@pietrodn
Copy link

pietrodn commented Jul 5, 2018

Did you also observe increased CPU utilization over time?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants