Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

esp8266: Memory error with E:M 1584 #1971

Open
Adixy opened this Issue Apr 6, 2016 · 17 comments

Comments

Projects
None yet
5 participants
@Adixy
Copy link

Adixy commented Apr 6, 2016

So, I made a socket server on esp and I'm recv-ing and send-ing data (AF_INET,SOCK_STREAM)
Only 1 client (my pc), and one server on esp, port 8000
It works for some time (approx 3-4 sec), but than it just stops, with no error or anything
It just shows
E:M 1584
on the end of REPL

Is there a fix? Is this a known error? Or is it my undoing?
:D

@peterhinch

This comment has been minimized.

Copy link
Contributor

peterhinch commented Apr 7, 2016

As it happens I have a test running now doing this, on port 12345. This is using firmware built from source yesterday. So far it's working. I suggest you provide more details such as code and firmware build.

@dpgeorge dpgeorge changed the title E:M 1584 esp8266: Memory error with E:M 1584 Apr 7, 2016

@Adixy

This comment has been minimized.

Copy link
Author

Adixy commented Apr 7, 2016

Server (on ESP8266 Huzzah, firmware from before 3days)
import socket,pyb
sig=0
TCP_IP = '192.168.1.9'
TCP_PORT = 8000
BUFFER_SIZE = 80
kom_data=bytearray(80)
i=1
while i==1:
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
print("Creating server",TCP_IP,':',TCP_PORT)
s.bind((TCP_IP, TCP_PORT))
print("Server created succesfully")
s.listen(1)
i=2 #This While is unnecessary, but I run it from the REPL so I just do it like this
try:
while True:
conn,addr=s.accept()
print ('Connection address:', addr)
data=conn.recv(BUFFER_SIZE)
print ("received data:", data)
conn.send(data)
conn.close()
pyb.delay(35) # do something with the data
except OSError:
break
s.close()

Client:
from struct import pack

import socket

TCP_IP = '192.168.1.9'
TCP_PORT = 8000
BUFFER_SIZE = 80

br=0
data=None

add=[]
data=[]
kom_data=bytearray(80)
print('Connecting...')
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect((TCP_IP, TCP_PORT))
print 'Connected to:',TCP_IP,':',TCP_PORT
while True:

 #do something that creates "add" and "data" from struct.pack

 kom_data=bytearray(add+data)

 print 'sending data...'

 if len(kom_data)==0:
     s.send(bytearray(2))
 else:
     s.send(kom_data)
 print 'sent:',len(kom_data)
 print s.recv(80)

add=[]
data=[]
s.close()
print 'Connecting...'
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect((TCP_IP, TCP_PORT))
print 'Connected to:',TCP_IP,':',TCP_PORT

s.close()

The Error was shown on the ESP, and the server was made on ESP IP addr by ESP :D

@Adixy

This comment has been minimized.

Copy link
Author

Adixy commented Apr 7, 2016

Ok, it removes my "Spaces" from comment before
so, on the server side, the "While i==1:" is the main thing. "Try:" is in that "while i==1:", and "While True:" is inside the "Try:"
"s.close()" is not inside "while i==1:"

Client is as it is shown

@Adixy

This comment has been minimized.

Copy link
Author

Adixy commented Apr 7, 2016

I use --> wifi_mode(3) and phy_mode(2)

This is the output, and as it can be seen, on the end there is E:M 1584, and after that is the OSError

Creating server 192.168.1.9 : 8000
Server created succesfully
Connection address: ('192.168.1.5', 58277)
received data: b'\x14\x00\x00\x00\x15\x00\x00\x00\x16\x00\x00\x00\x17\x00\x00\x00\x18\x00\x00\x00\x19\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
48
Connection address: ('192.168.1.5', 58278)
received data: b'\x14\x00\x00\x00\x15\x00\x00\x00\x16\x00\x00\x00\x17\x00\x00\x00\x18\x00\x00\x00\x19\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xe7\x03\x00\x00\x00\x00\x00\x00\xe7\x03\x00\x00\x00\x00\x00\x00'
48
.
.
.
. ##a lot of data for what took about 5-6 sec on todays firmware
.
Connection address: ('192.168.1.5', 58397)
received data: b'\x14\x00\x00\x00\x15\x00\x00\x00\x16\x00\x00\x00\x17\x00\x00\x00\x18\x00\x00\x00\x19\x00\x00\x00\xda\xfe\xff\xff\x18\xfc\xff\xff\xe7\x03\x00\x00\x00\x00\x00\x00\xe7\x03\x00\x00\x00\x00\x00\x00'
48
Connection address: ('192.168.1.5', 58398)
received data: b'\x14\x00\x00\x00\x15\x00\x00\x00\x16\x00\x00\x00\x17\x00\x00\x00\x18\x00\x00\x00\x19\x00\x00\x00z\xfe\xff\xff\x18\xfc\xff\xff\xe7\x03\x00\x00\x00\x00\x00\x00\xe7\x03\x00\x00\x00\x00\x00\x00'
48
Connection address: ('192.168.1.5', 58399)
received data: b'\x14\x00\x00\x00\x15\x00\x00\x00\x16\x00\x00\x00\x17\x00\x00\x00\x18\x00\x00\x00\x19\x00\x00\x00\xeb\xfd\xff\xff\x18\xfc\xff\xff\xe7\x03\x00\x00\x00\x00\x00\x00\xe7\x03\x00\x00\x00\x00\x00\x00'
48
Connection address: ('192.168.1.5', 58400)
received data: b'\x14\x00\x00\x00\x15\x00\x00\x00\x16\x00\x00\x00\x17\x00\x00\x00\x18\x00\x00\x00\x19\x00\x00\x00\x86\xfd\xff\xff<\xfc\xff\xff\xe7\x03\x00\x00\x00\x00\x00\x00\xe7\x03\x00\x00\x00\x00\x00\x00'
48
Connection address: ('192.168.1.5', 58401)
received data: b'\x14\x00\x00\x00\x15\x00\x00\x00\x16\x00\x00\x00\x17\x00\x00\x00\x18\x00\x00\x00\x19\x00\x00\x00\t\xfd\xff\xffg\xfc\xff\xff\xe7\x03\x00\x00\x00\x00\x00\x00\xe7\x03\x00\x00\x00\x00\x00\x00'
48
Connection address: ('192.168.1.5', 58402)
received data: b'\x14\x00\x00\x00\x15\x00\x00\x00\x16\x00\x00\x00\x17\x00\x00\x00\x18\x00\x00\x00\x19\x00\x00\x00\x9f\xfc\xff\xff\x9f\xfc\xff\xff\xe7\x03\x00\x00\x00\x00\x00\x00\xe7\x03\x00\x00\x00\x00\x00\x00'
48
Connection address: ('192.168.1.5', 58403)
received data: b'\x14\x00\x00\x00\x15\x00\x00\x00\x16\x00\x00\x00\x17\x00\x00\x00\x18\x00\x00\x00\x19\x00\x00\x00E\xfc\xff\xff\xda\xfc\xff\xff\xe7\x03\x00\x00\x00\x00\x00\x00\xe7\x03\x00\x00\x00\x00\x00\x00'
48
Connection address: ('192.168.1.5', 58404)
received data: b'\x14\x00\x00\x00\x15\x00\x00\x00\x16\x00\x00\x00\x17\x00\x00\x00\x18\x00\x00\x00\x19\x00\x00\x00\x18\xfc\xff\xff\x1c\xfd\xff\xff\xe7\x03\x00\x00\x00\x00\x00\x00\xe7\x03\x00\x00\x00\x00\x00\x00'
48
Connection address: ('192.168.1.5', 58405)
received data: b'\x14\x00\x00\x00\x15\x00\x00\x00\x16\x00\x00\x00\x17\x00\x00\x00\x18\x00\x00\x00\x19\x00\x00\x00\x18\xfc\xff\xff]\xfd\xff\xff\xe7\x03\x00\x00\x00\x00\x00\x00\xe7\x03\x00\x00\x00\x00\x00\x00'
48
Connection address: ('192.168.1.5', 58406)
received data: b'\x14\x00\x00\x00\x15\x00\x00\x00\x16\x00\x00\x00\x17\x00\x00\x00\x18\x00\x00\x00\x19\x00\x00\x00\x18\xfc\xff\xff\x88\xfd\xff\xff\xe7\x03\x00\x00\x00\x00\x00\x00\xe7\x03\x00\x00\x00\x00\x00\x00'
E:M 1584 #This is frustrating :(
Traceback (most recent call last):
File "", line 13, in
OSError: 12

line 13 is conn.send(data) ---> ECHO

@Adixy

This comment has been minimized.

Copy link
Author

Adixy commented Apr 7, 2016

Please forgive me for commenting this many, but I just want to help in improving the ESP8266 micropython. I just love python

But, to the thing
so I ran the code above. When I received E:M 1584 (further I will just call it em) I started the code again and it received just one info and stopped with the same error, em. I did it few times again, the same thing happened.

fun()
Connection address: ('192.168.1.5', 58673)
received data: b'\x14\x00\x00\x00\x15\x00\x00\x00\x16\x00\x00\x00\x17\x00\x00\x00\x18\x00\x00\x00\x19\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
E:M 1584
Traceback (most recent call last):
File "", line 1, in
File "", line 7, in fun
OSError: 12

fun()
Connection address: ('192.168.1.5', 58677)
received data: b'\x14\x00\x00\x00\x15\x00\x00\x00\x16\x00\x00\x00\x17\x00\x00\x00\x18\x00\x00\x00\x19\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
E:M 1584
Traceback (most recent call last):
File "", line 1, in
File "", line 7, in fun
OSError: 12
fun()
Connection address: ('192.168.1.5', 58681)
received data: b'\x14\x00\x00\x00\x15\x00\x00\x00\x16\x00\x00\x00\x17\x00\x00\x00\x18\x00\x00\x00\x19\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
E:M 1584
Traceback (most recent call last):
File "", line 1, in
File "", line 7, in fun
OSError: 12

After this, it came to my mind, if this is memory error, why not just to run the garbage collector:

gc.mem_free()
16688

gc.collect()

gc.mem_free()
22400

Now I started the code again and it actually went on for some time, the same as it did the first time I started it
but that em again appeared.. after the em again appeared:

gc.mem_free()
21392

So I started the code again and it also went on for approx 6 seconds, and stopped with the same em
But before I ran the code this time, I didn't do gc.collect().
Confusing
First time gc helped, second time I didn't need it
Third time, even I used gc again, it didnt help. I was unable to start it again and to send/recv the data more than once, before the em appeared

I hope this was helpful :D
If it wasn't, sorry for bothering

@peterhinch

This comment has been minimized.

Copy link
Contributor

peterhinch commented Apr 8, 2016

I think there is a firmware issue here. I used the following code, running on a pair of esp8266 devices, and the server does throw this error after a short period of running.
Server:

import socket, pyb, gc
TCP_IP = '0.0.0.0'
TCP_PORT = 8004
BUFFER_SIZE = 80

kom_data=bytearray(BUFFER_SIZE)
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
print("Creating server",TCP_IP,':',TCP_PORT)
s.bind((TCP_IP, TCP_PORT))
print("Server created succesfully")
s.listen(1)
while True:
    gc.collect()
    try:
        conn,addr=s.accept()
        print ('Connection address:', addr)
    except OSError:
        break
    recd = 0
    parts = []
    while recd < BUFFER_SIZE:
        part = conn.recv(BUFFER_SIZE - recd)
        recd += len(part)
        if part == b'':
            raise OSError('Connection lost')
        parts.append(part)
    print("received {} bytes".format(len(b''.join(parts))))
    sent = 0
    while sent < BUFFER_SIZE:
        sent = conn.send(kom_data[sent:])
        if sent == 0:
            raise OSError('Connection lost')
    conn.close()
s.close()

Client

import socket, pyb, gc

TCP_IP = '192.168.0.24'
TCP_PORT = 8004
BUFFER_SIZE = 80

kom_data=bytearray(BUFFER_SIZE)
while True:
    gc.collect()
    print('Connecting...')
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.connect((TCP_IP, TCP_PORT))
    s.setblocking(True)
    print( 'Connected to:',TCP_IP,':',TCP_PORT)

    print( 'sending data...')
    sent = 0
    while sent < BUFFER_SIZE:
        sent = s.send(kom_data[sent:])
        if sent == 0:
            raise OSError('Connection lost')
    print( 'sent:',len(kom_data))
    recd = 0
    parts = []
    while recd < BUFFER_SIZE:
        part = s.recv(BUFFER_SIZE - recd)
        recd += len(part)
        if part == b'':
            raise OSError('Connection lost')
        parts.append(part)
    print(b''.join(parts))
    s.close()
    pyb.delay(100)

Outcome (on server)

[numerous successful loops omitted]
Connection address: ('192.168.0.23', 27382)
received 80 bytes
E:M 1584
Traceback (most recent call last):
  File "<stdin>", line 30, in <module>
OSError: 12
>>> import micropython
>>> micropython.mem_info()
stack: 1952 out of 8192
GC: total: 24192, used: 4624, free: 19568
 No. of 1-blocks: 60, 2-blocks: 16, max blk sz: 80
>>> import pyb
>>> pyb.info(1)
_text_start=40100000
_text_end=40107f88
_irom0_text_start=40209000
_irom0_text_end=40269f32
_data_start=3ffe8000
_data_end=3ffe83e8
_rodata_start=3ffe83f0
_rodata_end=3ffe8558
_bss_start=3ffe8558
_bss_end=3fff4e78
_heap_start=3fff4e78
_heap_end=3fffc000
qstr:
  n_pool=1
  n_qstr=17
  n_str_data_bytes=161
  n_total_bytes=257
GC:
  24192 total
  4800 : 19392
  1=66 2=17 m=80
GC memory layout; from 3ffee7e0:
ee7e0: hthhhhhthThhDThLBhthhThhhttttthLhhttttthhttttthhttttthhhhhhhhthB
eebe0: hhthThttttthhthttttttthhthhhhhhThtttththhBhhttttttthttttthtttttt
eefe0: ttthhthttttttthtttththhtttthtBhBhhBhht....ht....hthtththhhhh....
ef3e0: ....hthhhh......................................................
ef7e0: .......htttttttttttttttttttttttttttttttttttttttttttttttttttttttt
efbe0: ttttttttttttttttttttttt...............................ht........
effe0: ...............httttttttttttttttttttttttttttttt.................
       (16 lines all free)
f43e0: ........................................
>>> 
@pfalcon

This comment has been minimized.

Copy link
Member

pfalcon commented Apr 9, 2016

Please rebuild esp-open-sdk - it used default vendor config for lwIP which is known to be not very memory efficient.

Generally, this error means that "OS" is running out of heap space (that's different heap than uPy heap). Usual cause is receiving from network too slow, or vice-versa, sending too fast. Or rather, not sending fast, but sending too little data per packet, which causes too many packets allocated. We will need to find good networking settings which balance memory use and other params. All in all, first step is to try the latest esp-open-sdk, as I wrote.

@peterhinch

This comment has been minimized.

Copy link
Contributor

peterhinch commented Apr 9, 2016

I updated and built esp-open-sdk, and built updated firmware. It ran for longer but the server side still failed after roughly 30 seconds (~300 80-byte messages).

Connection address: ('192.168.0.24', 7751)
received 80 bytes
E:M 656
Traceback (most recent call last):
  File "<stdin>", line 30, in <module>
OSError: 12
>>> 
>>> 
PYB: soft reboocould not find module 'boot'
MicroPython v1.6-388-gf38e8f5-dirty on 2016-04-09; ESP module with ESP8266
Type "help()" for more information.
>>> 

Note it rebooted spontaneously and the reboot message seems corrupted - I suppose unsurprising after a memory fault. In previous testing I reduced the repetition rate to one message per second, but this had no effect on the fault.

@pfalcon

This comment has been minimized.

Copy link
Member

pfalcon commented Apr 10, 2016

Then welcome to TIME-WAIT hell. Here's random link: http://www.serverframework.com/asynchronousevents/2011/01/time-wait-and-its-design-implications-for-protocols-and-scalable-servers.html . TL;DR: If you write a TCP server which accept short frequent client connections, then you're doing it wrong - TCP protocol is not designed for such usage, but rather for a long-living, persistent connections. On big servers, the issue manifests itself with tens of thousands of concurrent connections on real OSes, thousands on Windows (concurrent connections can be emulated to some extent using load of fast sequential connections).

It's HUGE success that ESP8266, having around 100,000 times less memory than servers above, can sustain more than 100 outstanding connections (based on apache bench testing).

@pfalcon

This comment has been minimized.

Copy link
Member

pfalcon commented Apr 10, 2016

esp.freemem() function was added to provide insight into "OS" heap. Note: OS and Python heaps are independent, OS heap is what used for networking buffers and control blocks.

@peterhinch

This comment has been minimized.

Copy link
Contributor

peterhinch commented Apr 11, 2016

Fair enough - I'll leave this to guys with TCP/IP experience. I merely converted @Adixy 's code into something which could be cut'n'pasted and run.

For what it's worth I think the ESP8266 port is awesome :)

@Adixylian

This comment has been minimized.

Copy link

Adixylian commented Apr 11, 2016

I am still having troubles with xtensa-lx106-elf-gcc, but when I fix it I will give it another try
Hopefully I'll get those sockets to work, one way or another.

@pfalcon

This comment has been minimized.

Copy link
Member

pfalcon commented Apr 11, 2016

Fair enough - I'll leave this to guys with TCP/IP experience. I merely converted @Adixy 's code into something which could be cut'n'pasted and run.

Yes, we'll definitely need to more research on how esp8266 works and how much it can do. And by we, I really mean both us developers and users. First step is to on the same line and use good testcases. We recommend official examples as shipped in the repo: https://github.com/micropython/micropython/tree/master/examples/network . They are minimal to exclude interference from MicroPython side of things and show just networking at work. (They're also portable and work out of the box on unix and esp8266).

From our side, we're going to provide more diagnostic functions to give everyone better insight into what happens behind the scenes, what resource may be exhausted, etc. esp.freemem() is just first step. And I personally wished I could take a deep dive into researching this stuff - that's what I went for esp8266 in the first place ;-). But in just few weeks we need to provide a general availability release to end users, so need to work on many "mundane" application-level functionality instead first.

@Adixylian

This comment has been minimized.

Copy link

Adixylian commented Apr 11, 2016

So I finally got it to work
But, the transmission rate was 1 per second, using the gc.collect
It was smth like this -->
while True:
.......
pyb.delay(1000)
gc.collect()

Like that it transmitted more than 400 times a data of 80bytes without showing any error, so I turned it off. Usually it would be around 150 times of 80bytes, which is great in comparrison with this last test. Actually, this is problem for a longer time --> #1479
On the other hand that is too slow. Probably ESP's RAM gets overwhelmed when the rate is too fast, in other words, gc.collect() doesn't recognize all the garbage cause there is too little amount of time.
That is just my opinion.
If I'm right, than it would be good to make a special module just for collecting the garbage made from sockets at much higher rate. Or to do something simillar, that would delete all the previous data when using sockets

I was using the same firmware that had E:M1584.

I hope this helped

@zsquareplusc

This comment has been minimized.

Copy link
Contributor

zsquareplusc commented Nov 8, 2016

Seems that i've also run into this issue with my web server impl. running on 1.8.5. With few connections per time, it runs quite long and with several connections per second it get ENOMEM within seconds.

I was thinking if close of the socket starts the shutdown but does not wait for completion. So if the Python loop calls accept quickly after that, then the underlying OS has two open connections. if further connections are made very quickly, their heap probably fragments. If you give it time to close all sockets in may recover. maybe? in that case, a blocking socket.close (or some other means that allows to check if socket shutdown is completed) may improve the situation significantly.

@pfalcon

This comment has been minimized.

Copy link
Member

pfalcon commented Nov 8, 2016

I was thinking if close of the socket starts the shutdown but does not wait for completion.

Read about TCP's TIME_WAIT state.

@zsquareplusc

This comment has been minimized.

Copy link
Contributor

zsquareplusc commented Nov 9, 2016

That's it.

You could have mentioned that, on the esp, there is even a nice, undocumented, utility function in the current version, that illustrates this very well :-)
>>> socket.print_pcbs()
it shows the listening socket as well as the ones in TIME_WAIT state.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.