Websocket loop executing only once. #663

Closed
jeet-parekh opened this issue Jun 4, 2018 · 12 comments · Fixed by #802
jeet-parekh commented Jun 4, 2018

I'm trying to load test a websocket server, but for some reason the vus option doesn't affect the test at all. I'm not sure if this is a bug; I may have misunderstood the docs, or my test script may be wrong.

Here's a simple echo server program to demonstrate what's going wrong. The server echoes each message back 10 times.

Here's my server code in Go.

package main

import (
	"fmt"
	"net/http"

	"github.com/gorilla/websocket"
)

var upgrader = websocket.Upgrader{}

func echo(w http.ResponseWriter, r *http.Request) {
	c, err := upgrader.Upgrade(w, r, nil)
	if err != nil {
		fmt.Println("upgrade:", err.Error())
		return
	}

	defer c.Close()

	for {
		mt, message, err := c.ReadMessage()
		if err != nil {
			fmt.Println("read:", err.Error())
			break
		}

		fmt.Println("recv:", string(message))

		for i := 0; i < 10; i++ {
			err = c.WriteMessage(mt, []byte("echo: "+string(message)))
			if err != nil {
				fmt.Println("write:", err.Error())
				break
			}
		}
	}
}

func main() {
	http.HandleFunc("/", echo)
	http.ListenAndServe("localhost:8016", nil)
}

And here's my test script:

import ws from "k6/ws";
import { check, sleep } from "k6";

export let options = {
    vus: 10,
    iterations: 1,
};

export default function () {
    let url = "ws://127.0.0.1:8016";
    let params = null;

    let res = ws.connect(url, params, function (socket) {
        socket.on('open', function () {
            console.log("connected")
        });

        socket.on('message', function (data) {
            console.log("Message received: ", data);
        });

        socket.on('close', function () {
            console.log('disconnected');
        });

        socket.send('msg')
        socket.close()
    });

    check(res, { "status is 101": (r) => r && r.status === 101 });
}

          /\      |‾‾|  /‾‾/  /‾/   
     /\  /  \     |  |_/  /  / /   
    /  \/    \    |      |  /  ‾‾\  
   /          \   |  |‾\  \ | (_) | 
  / __________ \  |__|  \__\ \___/ .io

  execution: local
     output: -
     script: wstest.js

    duration: -,   iterations: 1
         vus: 10, max: 10

INFO[0001] disconnected                                 
INFO[0001] connected                                    
    done [==========================================================] 1 / 1

    ✓ status is 101

    checks................: 100.00% ✓ 1    ✗ 0   
    data_received.........: 129 B   9.2 EB/s
    data_sent.............: 202 B   9.2 EB/s
    iteration_duration....: avg=824.93µs min=824.93µs med=824.93µs max=824.93µs p(90)=824.93µs p(95)=824.93µs
    iterations............: 1       +Inf/s
    vus...................: 10      min=10 max=10
    vus_max...............: 10      min=10 max=10
    ws_connecting.........: avg=391.66µs min=391.66µs med=391.66µs max=391.66µs p(90)=391.66µs p(95)=391.66µs
    ws_msgs_sent..........: 1       +Inf/s
    ws_session_duration...: avg=671.58µs min=671.58µs med=671.58µs max=671.58µs p(90)=671.58µs p(95)=671.58µs
    ws_sessions...........: 1       +Inf/s

There are 10 users, so I expected to see ws_msgs_sent = 10. The console.log("Message received: ", data); part of the test script never shows up in the output, and there isn't any ws_msgs_received metric either. I would appreciate a correction if I'm doing it wrong.

na-- (Member) commented Jun 4, 2018

The problem is that iterations are shared between all of the VUs. So if you run it with --vus 10 --iterations 10, each VU would run the script once. In your case, only one VU actually runs the script a single time and the rest just wait, doing nothing.

I'll leave this issue open, since I feel this misunderstanding is common enough to indicate we don't communicate the behavior clearly enough. Improving the documentation and CLI help would probably help ameliorate it, but we should probably also emit a warning when users specify fewer iterations than VUs.
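
For example, with the shared pool described above, giving each of the 10 VUs roughly one iteration means asking for 10 total iterations - a minimal sketch of the equivalent script options (the numbers are illustrative):

export let options = {
    vus: 10,
    iterations: 10, // total shared between all VUs, not per VU
};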

jeet-parekh (Author) commented Jun 4, 2018

Got it. If vus and iterations are the same, is it guaranteed that each VU will run the script exactly once?

Is this a proper way to handle a situation where the connection should be closed only after n messages have been read?

socket.setInterval(function timeout() {
    if (myCounter >= n) {
        socket.close()
    }
}, 1000);
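
For context, here's a fuller sketch of that pattern, assuming myCounter is incremented in the message handler (the threshold and URL are illustrative, reusing the echo server above):

import ws from "k6/ws";

const n = 10; // illustrative: the echo server above replies 10 times per message

export default function () {
    let myCounter = 0;

    ws.connect("ws://127.0.0.1:8016", null, function (socket) {
        socket.on('open', function () {
            socket.send('msg');
        });

        socket.on('message', function (data) {
            myCounter++; // count every received message
        });

        // poll once a second; close only after n messages have arrived
        socket.setInterval(function () {
            if (myCounter >= n) {
                socket.close();
            }
        }, 1000);
    });
}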

na-- (Member) commented Jun 5, 2018

I wouldn't say it's guaranteed. If you have a very, very short default function, it's theoretically possible for one VU to run two iterations. Regarding the websocket closing after n messages, what you've shown seems pretty OK.

jeet-parekh (Author) commented:

I noticed that happening yesterday, after adding a bunch of console.logs. This adds a bit of unreliability to the tests.

> it's theoretically possible for one VU to run two iterations

Is there a reason for this approach/behavior, or is it something that will be changed in the future?

na-- (Member) commented Jun 5, 2018

The reason for the current approach is that k6 is primarily a load-testing tool, and usually there's no reason for VUs to sit idle when there's work to be done. When there is a reason, you can explicitly add slack with sleep() or by ramping up and down via stages, but when you just have a duration and VUs, all VUs constantly loop through the default function for the entire duration. When you use iterations instead of duration, all specified VUs are still constantly executing the default function, "stealing" work (iterations) from the common pile. For additional flexibility, we plan to add another execution model soon: arrival-rate based VUs. See the linked issue for more information about the differences between it and the current looping VUs.
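
For illustration, ramping via stages looks roughly like this in the script options (the durations and targets here are arbitrary):

export let options = {
    stages: [
        { duration: "30s", target: 10 }, // ramp up to 10 VUs
        { duration: "1m", target: 10 },  // hold at 10 VUs
        { duration: "30s", target: 0 },  // ramp down to 0
    ],
};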

I think that generally it shouldn't matter all that much which VU executes a script iteration, even if you don't use k6 only for simple load testing. If you have a complicated scenario/use case or also want to do some functional testing, using the __ITER and __VU execution context variables should give you something like a unique ID for a particular iteration. Can you tell me what your use case is and explain a bit why this work-stealing approach to iterations adds unreliability?
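
For instance, a minimal sketch of deriving such an ID (the format is just an example):

export default function () {
    // __VU is the VU number (1-based), __ITER the iteration number (0-based)
    let clientId = "vu" + __VU + "-iter" + __ITER;
    console.log("running as " + clientId);
}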

jeet-parekh (Author) commented:

Part of the test is for a custom publisher-subscriber messaging broker (similar to Apache Kafka, PubNub, or Pusher).

A simple test case would be 10 users and 10 channels, where each user publishes a message to each channel 10 times. So there would be 1,000 messages sent (10 users × 10 channels × 10 messages) and 10,000 received, since each message is delivered to all 10 subscribers (the publisher optionally receives its own published messages as well). This relates to my previous question - each VU should stop executing only after it has received 1,000 messages.

Before the message transactions start, there are a few other operations that help the server uniquely identify each connected client. Ignoring the numbers, I would need each VU here to handle its own messages and not steal others'.

na-- (Member) commented Jun 5, 2018

It seems like you shouldn't have any issues if you run with --vus 10 --iterations 10; since there's a substantial amount of work in each iteration, each VU would execute only one.
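
That is, something like the following (using the script name from the output above):

k6 run --vus 10 --iterations 10 wstest.js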

jeet-parekh (Author) commented Jun 6, 2018

I got a couple of steals with this on some random test runs. There are also some other small tests that would require the VUs not to steal tasks. The majority of my tests require a specific number of VUs to be connected to the server and running simultaneously.

I saw the link you shared about __VU and __ITER. I'm not sure about it though. Say I add this block (or any other equivalent logic to prevent a VU from executing its second iteration) to the beginning.

if (__ITER !== 0) {
    return
}

It would mean that the VU started another iteration, i.e. stole a task. But because it has already iterated once, it would exit. It wouldn't put the task back for another VU to handle it. Correct?

na-- (Member) commented Jun 6, 2018

> The majority of my tests require a specific number of VUs to be connected to the server and running simultaneously

But that's just what VUs are. The vus setting just controls the maximum concurrency of the execution: if you specify 10, you'd never have more than 10 VUs connected to the server. If you have more iterations than VUs and a VU starts its second iteration before the rest, that just means it finished its first iteration faster than the others. I still don't quite understand why your use case needs this precision and why it matters so much that iterations are evenly distributed, but I'll try to help as much as possible 😄

> It wouldn't put the task back for another VU to handle it. Correct?

Yes, once the default function runs, that VU is already executing an iteration. If you return early from it, that iteration is considered finished, and if there are any remaining iterations in the "pool", the VU would just continue looping through and executing them. My point about __VU and __ITER was that you could use them to uniquely identify the "client". As you later mentioned:

> Before the message transactions start, there are a few other operations that help the server uniquely identify each connected client.

That's kind of the purpose of the __VU and __ITER pair.

But if you actually need that much precision, maybe you'd benefit a lot more from sleep(), which lets you artificially lengthen iteration times... If you really need uniform distribution of the iterations between the VUs, you could get the current timestamp at the beginning of the default function (for example with Date.now()), get the timestamp at the end, and then sleep() for a different amount in each iteration so that the final iteration time in every VU is roughly the same, as sketched below.
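
A minimal sketch of that padding approach (the 5-second target duration is arbitrary):

import { sleep } from "k6";

const TARGET_SECONDS = 5; // illustrative per-iteration duration

export default function () {
    let start = Date.now();

    // ... actual test logic goes here ...

    // pad the iteration so every VU finishes in roughly the same time
    let elapsed = (Date.now() - start) / 1000;
    if (elapsed < TARGET_SECONDS) {
        sleep(TARGET_SECONDS - elapsed);
    }
}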

jeet-parekh (Author) commented:

I really appreciate you trying to help me.

We came up with a way to handle it. Just for testing purposes, we changed the code a bit so that the server sends a message at the end signaling the client to disconnect. Then it worked the way we wanted it to. k6 has been extremely useful.

If you're open to feature requests, I would request a flag to have each VU run only a single iteration. Something like --single-iter, --no-steal, or anything you think fits.

Thanks a lot for your time.

na-- (Member) commented Jun 6, 2018

Thanks, we're quite open to feature requests and your use case is a valuable data point 😄

We previously considered adding --per-vu-iterations N or something like that (so your use case would be covered by --per-vu-iterations 1), but for other reasons: we realized that when you specify iterations instead of duration in a distributed execution environment (i.e. cloud/cluster mode), k6 instances on different machines can't easily process ("steal") iterations from the total specified number. Doing it the way we currently do local execution would involve a lot of cumbersome, error-prone network synchronization between the different k6 nodes.

jeet-parekh (Author) commented:

Makes sense. That would be a really useful feature though.

Thanks again 😄
Keeping this issue open like you wanted.

robingustafsson added this to the v1.0.0 milestone Jul 6, 2018
na-- added a commit that referenced this issue Oct 10, 2018:
Combined with mentions in the readme, docs and CLI help this should be enough to prevent confusion, so we can close #663
na-- closed this as completed in #802 Oct 10, 2018
na-- mentioned this issue May 15, 2019
na-- modified the milestones: v1.0.0, v0.27.0 May 21, 2020