This repository has been archived by the owner on May 12, 2018. It is now read-only.

inttest/ct3: fix travis-ci breakage #421

Merged
merged 1 commit into from Jan 14, 2015

@ghost ghost commented Jan 2, 2015

No description provided.

@ghost
Author

ghost commented Jan 2, 2015

Let's see if this works better on travis-ci.

@ghost
Author

ghost commented Jan 2, 2015

Before:

{error_logger,{{2015,1,1},{13,48,39}},
    "Can't set long node name!\nPlease check your configuration\n",[]}

After:

{error_logger,{{2015,1,2},{21,22,11}},
  "Protocol: ~tp: the name ct_rt3@localhost seems to be in use by another Erlang node",
["inet_tcp"]}

@ferd
Contributor

ferd commented Jan 5, 2015

What happens if we generate a random name? I'm wondering if the issue is actually the name or something regarding their network and instances or something entirely different.

@ghost
Author

ghost commented Jan 5, 2015

@ghost
Author

ghost commented Jan 9, 2015

Modified Makefile to test and gather info.

@ghost
Author

ghost commented Jan 9, 2015

Failed:

  • -name ct3

Succeeded:

  • -sname ct3
  • -name ct3@127.0.0.1
  • -name ct3@::1
  • -sname ct3@::1

@ferd
Contributor

ferd commented Jan 9, 2015

So that's a sign that this is about long names versus short names. Erlang considers a host name to be long if it's got a period in it (.) and short otherwise. Or rather, it treats a host with a . in it as not short, and accepts mostly anything submitted for a long name:

https://github.com/erlang/otp/blob/maint/lib/kernel/src/net_kernel.erl#L1244-L1268

So 127.0.0.1 is forced long, but ::1 can be either. I'm guessing that's why it fails to set the short name with a 127.0.0.1 address, while ct3@::1 works.

For the auto-detection, we'd have to look at the output of {inet_db:gethostname(),inet_db:res_option(domain)}, which would explain fairly quickly why it fails.
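To make the rule above concrete, here is a minimal Python sketch of the host-part decision as described in this comment. It is a model written for illustration, not the OTP code: `create_node_name` and its parameters are hypothetical names, and `hostname` and `domain` stand in for the values of `inet_db:gethostname()` and `inet_db:res_option(domain)`. The branch order is an assumption based on the linked `net_kernel.erl` lines.

```python
def create_node_name(name, mode, hostname, domain):
    """Model the host-part rule for a node name.

    mode is "longnames" (erl -name) or "shortnames" (erl -sname).
    hostname and domain are the locally configured values (assumed inputs).
    Returns the full node name, or raises ValueError if it cannot be set.
    """
    head, at, host = name.partition("@")
    if at and host:
        # An explicit host part was given on the command line.
        if mode == "shortnames" and "." in host:
            raise ValueError("Can't set short node name!")
        return f"{head}@{host}"
    # No host part: derive one from the local configuration.
    if mode == "shortnames":
        if hostname:
            return f"{head}@{hostname}"
        raise ValueError("Can't set short node name!")
    if hostname and domain:
        return f"{head}@{hostname}.{domain}"
    raise ValueError("Can't set long node name!")


# Environment assumed for the failing worker: a hostname but no domain.
worker = {"hostname": "localhost", "domain": ""}
cases = [
    ("longnames", "ct3"),
    ("shortnames", "ct3"),
    ("longnames", "ct3@127.0.0.1"),
    ("longnames", "ct3@::1"),
    ("shortnames", "ct3@::1"),
]
for mode, name in cases:
    try:
        print(mode, name, "->", create_node_name(name, mode, **worker))
    except ValueError as err:
        print(mode, name, "-> error:", err)
```

Under these assumptions only `-name ct3` fails, which matches the failed/succeeded list earlier in the thread: a long name without an explicit host needs both a hostname and a non-empty domain.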

@ferd
Contributor

ferd commented Jan 9, 2015

So you see the problem in the output: {inet_db:gethostname(), inet_db:res_option(domain)} -> {"localhost",[]}. That domain value is not set. It can be set by hand with https://github.com/erlang/otp/blob/maint/lib/kernel/src/inet_db.erl#L211, but in practice, it should be done by the system: https://github.com/erlang/otp/blob/172e812c491680fbb175f56f7604d4098cdc9de4/lib/kernel/src/inet_config.erl#L250-L261

https://github.com/erlang/otp/blob/172e812c491680fbb175f56f7604d4098cdc9de4/lib/kernel/src/inet_config.erl#L231-L245

I'm guessing this bit here has a single name found and ends up not setting both values. Not sure why that happens exactly.
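The guess above can be illustrated with a small Python sketch. It assumes the linked `inet_config.erl` code splits the system hostname at the first dot into a host and a domain; that behaviour is an assumption here, and `split_hostname` is a hypothetical helper, not an OTP function. The second hostname is taken from the `ct_run` command line quoted later in this thread, on the assumption that it is what the container-based worker reports.

```python
def split_hostname(name):
    """Split a system hostname at the first dot into (host, domain).

    A single-label name such as "localhost" yields an empty domain.
    """
    host, _, domain = name.partition(".")
    return host, domain


for name in [
    "localhost",
    "testing-worker-linux-docker-f95833d6-3233-linux-5.prod.travis-ci.org",
]:
    host, domain = split_hostname(name)
    print(f"{name!r}: host={host!r}, domain={domain!r}")
```

With a single-label hostname there is nothing after the dot, so the domain stays empty, which is consistent with the `{"localhost",[]}` output shown above. A worker that reports a full FQDN would yield both parts.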

@ghost
Author

ghost commented Jan 14, 2015

Before I test the container-based workers, let's test binding to 127.0.0.1.

@ferd
Contributor

ferd commented Jan 14, 2015

Well, that seems to work. If we can rewrite the tests to use that, that would probably be best, so our build status is accurate again. In any case, I'm not sure we should depend on a properly configured hosts file when it's possible to avoid it, given that contributors may turn out to have a broken config on their local machines as well.

@ghost
Author

ghost commented Jan 14, 2015

It's a last resort, and it has problems like being IPv4-specific and having to remember that any and all distributed nodes we start use @127.0.0.1.

@ghost
Author

ghost commented Jan 14, 2015

Now, let's try container-based workers.

@ferd
Contributor

ferd commented Jan 14, 2015

That seemed to go well too.

@ghost
Author

ghost commented Jan 14, 2015

Yes, @BanzaiMan's suggestion to use container-based workers fixed it.
I'd prefer to go with this, considering it worked correctly the way it's supposed to and the name looks sane (ct_run -noshell [...] -name test@testing-worker-linux-docker-f95833d6-3233-linux-5.prod.travis-ci.org).

@ghost
Author

ghost commented Jan 14, 2015

Cleaned up branch and commit.

Spinning up a distributed Erlang node by running 'erl -name ct3' failed
due to FQDN issues. Hiro Asari from Travis-CI suggested to try out the
new container-based workers as a fix, and that one works as expected
because we get a proper FQDN. Therefore, make the switch to
container-based workers.
@ferd
Contributor

ferd commented Jan 14, 2015

Sweet, thanks for the fix. Merging.

ferd added a commit that referenced this pull request Jan 14, 2015
inttest/ct3: fix travis-ci breakage
@ferd ferd merged commit 890390b into rebar:master Jan 14, 2015
@ghost ghost deleted the fix-travis-ci-shortname branch January 14, 2015 20:21