Compilation to arm fails #102

Closed
finalclass opened this issue Apr 21, 2022 · 26 comments
Comments

@finalclass

finalclass commented Apr 21, 2022

Describe the bug
Compiling to ARM does not work

To Reproduce

$ GOARCH=arm go build

Expected behavior
Compilation works fine

Actual behaviour
This error is displayed:

# github.com/ergo-services/ergo/lib/osdep
../../go/pkg/mod/github.com/ergo-services/ergo@v1.999.210/lib/osdep/linux.go:15:11: invalid operation: usage.Utime.Sec * 1000000000 + usage.Utime.Nano() (mismatched types int32 and int64)
../../go/pkg/mod/github.com/ergo-services/ergo@v1.999.210/lib/osdep/linux.go:16:11: invalid operation: usage.Stime.Sec * 1000000000 + usage.Stime.Nano() (mismatched types int32 and int64)
# github.com/ergo-services/ergo/lib
../../go/pkg/mod/github.com/ergo-services/ergo@v1.999.210/lib/tools.go:166:11: cannot use 4294967000 (untyped int constant) as int value in assignment (overflows)

Environment (please complete the following information):

  • Arch: arm
  • OS: Linux
  • Framework Version [v1.999.210]
  • Number of CPU or (GOMAXPROCS not set)

Additional context
Removing GOARCH fixes it; however, I need to run a few services on ARM.
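
For context, both errors appear to come from 32-bit sizes on GOARCH=arm: int is 32 bits wide there, so the constant 4294967000 exceeds its maximum of 2147483647, and usage.Utime.Sec is an int32 that cannot be mixed with the int64 returned by Nano() without a conversion. Below is a minimal sketch of the overflow check. fitsInInt is a hypothetical helper written for illustration and is not part of ergo.

```go
package main

import (
	"fmt"
	"math"
	"strconv"
)

// fitsInInt reports whether v can be stored in a signed integer
// of the given width (32 or 64 bits). Hypothetical helper, not ergo code.
func fitsInInt(v int64, bits int) bool {
	if bits == 32 {
		return v >= math.MinInt32 && v <= math.MaxInt32
	}
	return true
}

func main() {
	// The constant from lib/tools.go:166 in the error message above.
	const limit = 4294967000

	// strconv.IntSize is the width of int on the build target:
	// 32 for GOARCH=arm, 64 for GOARCH=arm64 and amd64.
	fmt.Println("int size on this target:", strconv.IntSize)
	fmt.Println("fits in 32-bit int:", fitsInInt(limit, 32))
	fmt.Println("fits in 64-bit int:", fitsInInt(limit, 64))
}
```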

@halturin
Collaborator

Thanks for the report. I have tested it on arm64 only. Will take a look

@finalclass
Author

finalclass commented Apr 22, 2022

Thanks for your response. Indeed, it compiles fine for arm64. However, when I try to run it on an Asus Tinker Edge T, it hangs, and after a while the board shuts off.
To reproduce:

cd ergo/examples/simple
GOARCH=arm64 go build simple.go
# scp to arm64 device

on the arm64 device:

./simple

The result: it hangs. I have to restart the device.
Occasionally I get this error log:

Message from syslogd@vexing-eft at Apr 22 08:13:22 ...
 kernel:[   64.830251] Internal error: undefined instruction: 0 [#1] PREEMPT SMP

Message from syslogd@vexing-eft at Apr 22 08:13:22 ...
 kernel:[   64.964868] Process simple (pid: 4048, stack limit = 0xffff000013498000)

Message from syslogd@vexing-eft at Apr 22 08:13:22 ...
 kernel:[   65.199004] Code: d2800014 54000be1 b9404004 f9401c03 (23232323)

However usually there is no log at all.

I debugged it a little and found that it hangs on this line: https://github.com/ergo-services/ergo/blob/79bebaa/proto/dist/resolver.go#L225
However, the value of the dsn variable seems to be correct: localhost:4369

On my local PC it works fine (but I have Erlang installed), while on the ARM machine it does not. Could you give me some clues? I would really appreciate it.

@halturin
Collaborator

halturin commented Apr 22, 2022 via email

@halturin
Collaborator

halturin commented Jun 29, 2022

Sorry for the delay. I still don't have anything similar to your device.

May I ask you to try these fixes:
/ergo-services/ergo@v1.999.210/lib/tools.go:166
replace:

  limit = 4294967000

by

  limit = math.MaxInt

This also requires adding the "math" package to the import section.

And update ResourceUsage function in ergo-services/ergo@v1.999.210/lib/osdep/linux.go
with this code

	var usage syscall.Rusage
	var utime, time int64
	if err := syscall.Getrusage(syscall.RUSAGE_SELF, &usage); err == nil {
		utime = int64(usage.Utime.Sec)*1000000000 + usage.Utime.Nano()
		stime = int64(usage.Stime.Sec)*1000000000 + usage.Stime.Nano()
	}
	return utime, stime

Or you can try this branch https://github.com/ergo-services/ergo/tree/fixarm
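
A note on the snippet above: in Go's syscall package, Timeval.Nano() already returns seconds and microseconds combined as nanoseconds, so adding int64(Sec)*1000000000 on top of it would count the seconds twice. The sketch below assumes the intent is total user and system CPU time in nanoseconds. resourceUsage is a stand-in name for illustration, not the function as it exists in ergo.

```go
package main

import (
	"fmt"
	"syscall"
)

// resourceUsage returns the user and system CPU time consumed by the
// current process, in nanoseconds. Stand-in for illustration only.
func resourceUsage() (utime, stime int64) {
	var usage syscall.Rusage
	if err := syscall.Getrusage(syscall.RUSAGE_SELF, &usage); err == nil {
		// Nano() converts Sec and Usec to int64 internally, so this
		// compiles on 32-bit targets where Sec is an int32.
		utime = usage.Utime.Nano()
		stime = usage.Stime.Nano()
	}
	return utime, stime
}

func main() {
	u, s := resourceUsage()
	fmt.Printf("user=%dns system=%dns\n", u, s)
}
```

Because no arithmetic is done on the Sec field directly, the same source should build for both GOARCH=arm and GOARCH=arm64.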

@finalclass
Author

Sure I will try it out. However I'm OOO currently and will be able to check it early next week.

@finalclass
Author

Hi,
Unfortunately, this does not seem to help. I've added go.mod to ergo/examples/simple:

module simpl.com/simple

go 1.18

replace github.com/ergo-services/ergo => ../../
replace github.com/ergo-services/ergo/etf => ../../etf
replace github.com/ergo-services/ergo/gen => ../../gen
replace github.com/ergo-services/ergo/node => ../../node
require github.com/ergo-services/ergo v1.999.211 // indirect

and then I've made the fixes you've mentioned. There was one mistake on line:

var utime, time int64

I assume it should be:

var utime, stime int64

I've built everything and ran the example again, but it hung the same way it did last time.

@halturin
Collaborator

halturin commented Jul 4, 2022

so there were a few issues

  • compilation (it's been fixed, I assume).
  • run built binary on arm hardware (it hangs on start, I suppose)

It seems I need to get the same HW (or a similar one) or a way to run it somehow in the VM (I have no clue how to do this so far).

@finalclass
Author

Yes the compilation issue has been fixed (actually I was compiling to the wrong target: it should be arm64, not just arm).
The running part appears to still be a problem. Let me know if you wish me to test anything else on the device.

@halturin
Collaborator

halturin commented Jul 5, 2022

I finally bought this device. Waiting for the shipment.

@halturin
Collaborator

I've just tested simple example on my tinker board. No issues (master branch)
image

Could you please check on your board the same way? You may also want to add -ergo.trace to see extra debug info. Like this...

mendel@tinker:~$ ./simple -ergo.trace
2022/07/27 10:09:15 Start node with name "node@localhost" and cookie "cookies"
2022/07/27 10:09:15 Node listening range: 15000...65000
2022/07/27 10:09:15 Started embedded EMPD service and listen port: 4369
2022/07/27 10:09:15 EPMD accepted new connection from [::1]:46830
2022/07/27 10:09:15 Request from EPMD client: [0 22 120 58 152 77 0 0 6 0 5 0 4 110 111 100 101 0 4 17 59 1 0 0]
2022/07/27 10:09:15 [node@localhost] EPMD client: node registered
2022/07/27 10:09:15 [node@localhost] CORE registering behavior "erlang" in group "ergo:applications"
2022/07/27 10:09:15 [node@localhost] CORE registering process: <2C323E75.0.1001>
2022/07/27 10:09:15 [node@localhost] CORE spawn a new process <2C323E75.0.1001> (registered name: "")
2022/07/27 10:09:15 [node@localhost] CORE registering name (<2C323E75.0.1002>): net_kernel_sup
2022/07/27 10:09:15 [node@localhost] CORE registering process: <2C323E75.0.1002>
2022/07/27 10:09:15 [node@localhost] CORE spawn a new process <2C323E75.0.1002> (registered name: "net_kernel_sup")
2022/07/27 10:09:15 [node@localhost] SUPERVISOR "net_kernel_sup" with restart strategy: one_for_one[permanent]
2022/07/27 10:09:15 [node@localhost] CORE registering name (<2C323E75.0.1003>): net_kernel
2022/07/27 10:09:15 [node@localhost] CORE registering process: <2C323E75.0.1003>
2022/07/27 10:09:15 [node@localhost] CORE spawn a new process <2C323E75.0.1003> (registered name: "net_kernel")
2022/07/27 10:09:15 NET_KERNEL: Init: []etf.Term(nil)
2022/07/27 10:09:15 [node@localhost] LINK process: <2C323E75.0.1002> => <2C323E75.0.1003>
2022/07/27 10:09:15 [node@localhost] CORE registering name (<2C323E75.0.1004>): global_name_server
2022/07/27 10:09:15 [node@localhost] CORE registering process: <2C323E75.0.1004>
2022/07/27 10:09:15 [node@localhost] CORE spawn a new process <2C323E75.0.1004> (registered name: "global_name_server")
2022/07/27 10:09:15 [node@localhost] LINK process: <2C323E75.0.1002> => <2C323E75.0.1004>
2022/07/27 10:09:15 [node@localhost] CORE registering name (<2C323E75.0.1005>): rex
2022/07/27 10:09:15 [node@localhost] CORE registering process: <2C323E75.0.1005>
2022/07/27 10:09:15 [node@localhost] CORE spawn a new process <2C323E75.0.1005> (registered name: "rex")
2022/07/27 10:09:15 REX: Init: []etf.Term(nil)
2022/07/27 10:09:15 [node@localhost] CORE registering behavior "erpc" in group "ergo:remote"
2022/07/27 10:09:15 [node@localhost] LINK process: <2C323E75.0.1002> => <2C323E75.0.1005>
2022/07/27 10:09:15 [node@localhost] CORE registering name (<2C323E75.0.1006>): observer_backend
2022/07/27 10:09:15 [node@localhost] CORE registering process: <2C323E75.0.1006>
2022/07/27 10:09:15 [node@localhost] CORE spawn a new process <2C323E75.0.1006> (registered name: "observer_backend")
2022/07/27 10:09:15 OBSERVER: Init: []etf.Term(nil)
2022/07/27 10:09:15 [node@localhost] RPC provide: proc_lib:translate_initial_call (gen.RPC)(0x1fc500)
2022/07/27 10:09:15 [node@localhost] RPC provide: appmon_info:start_link2 (gen.RPC)(0x1fc740)
2022/07/27 10:09:15 [node@localhost] LINK process: <2C323E75.0.1002> => <2C323E75.0.1006>
2022/07/27 10:09:15 [node@localhost] CORE registering name (<2C323E75.0.1007>): erlang
2022/07/27 10:09:15 [node@localhost] CORE registering process: <2C323E75.0.1007>
2022/07/27 10:09:15 [node@localhost] CORE spawn a new process <2C323E75.0.1007> (registered name: "erlang")
2022/07/27 10:09:15 ERLANG: Init: []etf.Term(nil)
2022/07/27 10:09:15 [node@localhost] LINK process: <2C323E75.0.1002> => <2C323E75.0.1007>
2022/07/27 10:09:15 [node@localhost] LINK process: <2C323E75.0.1001> => <2C323E75.0.1002>
2022/07/27 10:09:15 [node@localhost] CORE registering name (<2C323E75.0.1008>): gs1
2022/07/27 10:09:15 [node@localhost] CORE registering process: <2C323E75.0.1008>
2022/07/27 10:09:15 [node@localhost] CORE spawn a new process <2C323E75.0.1008> (registered name: "gs1")
2022/07/27 10:09:15 [node@localhost] CORE route message by pid (local) <2C323E75.0.1008>
2022/07/27 10:09:15 [node@localhost] GEN_SERVER <2C323E75.0.1008> got message from <2C323E75.0.1008>
2022/07/27 10:09:15 m: 100
HandleInfo: 100
2022/07/27 10:09:16 [node@localhost] CORE route message by pid (local) <2C323E75.0.1008>
2022/07/27 10:09:16 [node@localhost] GEN_SERVER <2C323E75.0.1008> got message from <2C323E75.0.1008>
2022/07/27 10:09:16 m: 101
HandleInfo: 101
2022/07/27 10:09:17 [node@localhost] CORE route message by pid (local) <2C323E75.0.1008>
2022/07/27 10:09:17 [node@localhost] GEN_SERVER <2C323E75.0.1008> got message from <2C323E75.0.1008>
2022/07/27 10:09:17 m: 102
HandleInfo: 102
2022/07/27 10:09:18 [node@localhost] CORE route message by pid (local) <2C323E75.0.1008>
2022/07/27 10:09:18 [node@localhost] GEN_SERVER <2C323E75.0.1008> got message from <2C323E75.0.1008>
2022/07/27 10:09:18 m: 103
HandleInfo: 103
2022/07/27 10:09:19 [node@localhost] CORE route message by pid (local) <2C323E75.0.1008>
2022/07/27 10:09:19 [node@localhost] GEN_SERVER <2C323E75.0.1008> got message from <2C323E75.0.1008>
2022/07/27 10:09:19 m: 104
HandleInfo: 104
2022/07/27 10:09:20 [node@localhost] CORE route message by pid (local) <2C323E75.0.1008>
2022/07/27 10:09:20 [node@localhost] GEN_SERVER <2C323E75.0.1008> got message from <2C323E75.0.1008>
2022/07/27 10:09:20 m: 105
HandleInfo: 105
2022/07/27 10:09:20 [node@localhost] CORE unregistering process: <2C323E75.0.1008>
2022/07/27 10:09:20 [node@localhost] CORE unregistering name (<2C323E75.0.1008>): gs1
2022/07/27 10:09:20 [node@localhost] MONITOR process terminated: <2C323E75.0.1008>
exited
2022/07/27 10:09:20 accept tcp 127.0.0.1:15000: use of closed network connection
mendel@tinker:~$

@finalclass
Author

Sorry for the late answer.
That's quite strange, because for me it still does not work. I don't have any customizations on the device, just a bare Tinker OS installation.

$ uname -a
Linux vexing-eft 4.14.98-imx #1 SMP PREEMPT Wed Jun 9 15:32:53 UTC 2021 aarch64 GNU/Linux

image

It seems that we even have the same kernel version (only the build time is different).

@halturin
Collaborator

halturin commented Aug 9, 2022

what version of golang are you using? OS, distro?

PS: forgot to mention. I've updated simple.go by adding flag.Parse() in order to use -ergo.trace
PPS: you may contact me via telegram halturin for instant messaging.

@halturin
Collaborator

halturin commented Aug 9, 2022

just to make sure I've tested one more time (recently updated my OS environment)
image

@finalclass
Author

I'm compiling it on:

$ go version
go version go1.19 linux/amd64
$ uname -a
Linux rog 5.15.59-1-MANJARO #1 SMP PREEMPT Wed Aug 3 11:20:04 UTC 2022 x86_64 GNU/Linux

@halturin
Collaborator

May I ask you to build it using golang 1.18?

@finalclass
Author

Unfortunately it's the same.

image

@halturin
Collaborator

could you please add flag.Parse() into the main function of simple.go and run it with -ergo.trace on this board?
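
For anyone following along: a command-line flag registered with Go's flag package only takes effect once the flag set is parsed, which is why a call to flag.Parse() in main is needed before -ergo.trace does anything. The sketch below shows the mechanism with a private FlagSet so it can be run on its own. parseTrace is a hypothetical helper, and the flag here is a local stand-in, not ergo's own registration.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// parseTrace registers a boolean "ergo.trace" flag on a private FlagSet
// and parses args. Hypothetical helper; ergo registers its own flag on
// the default flag.CommandLine set, which flag.Parse() processes.
func parseTrace(args []string) (bool, error) {
	fs := flag.NewFlagSet("simple", flag.ContinueOnError)
	trace := fs.Bool("ergo.trace", false, "enable extra debug output")
	if err := fs.Parse(args); err != nil {
		return false, err
	}
	return *trace, nil
}

func main() {
	// Without the Parse call the flag would silently keep its default.
	enabled, err := parseTrace(os.Args[1:])
	if err != nil {
		os.Exit(2)
	}
	fmt.Println("trace enabled:", enabled)
}
```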

@halturin
Collaborator

halturin commented Aug 11, 2022

btw, is this hostname (or localhost) present in /etc/hosts?

@finalclass
Author

image

image

@halturin
Collaborator

thanks for the quick reply. I'll take a look at why the node couldn't connect to itself on the 4369.
PS: do you have any firewall settings on this board?

@halturin
Collaborator

and the last thing I would like to ask you... could you please try the v220 branch? It's got some changes in the network module.

@finalclass
Author

This got me thinking. If it hangs on trying to connect to itself then a simple http server should also be broken:

image

So I guess you can close this issue because it's clearly not related to the ergo library.

@finalclass
Author

finalclass commented Aug 11, 2022

There is one difference: with the HTTP server, the server itself does not hang, and I can stop it with Ctrl+C. It is curl that hangs.

@halturin
Collaborator

halturin commented Aug 11, 2022

I guess there must be a firewall setting that drops any incoming connections. You may check this out with iptables -L and iptables -F to flush them out. Otherwise, I have no clue what it could be. Looks weird.

@finalclass
Author

It seems that it's not a firewall:

image

When I find some time I will try to reinstall the system.

@halturin
Collaborator

Since this bug is not related to ergo I'm closing it.
