Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Custom_sd] Impossible to remove targets from the discovery #4573

Closed
Nexucis opened this Issue Sep 4, 2018 · 4 comments

Comments

Projects
None yet
2 participants
@Nexucis
Copy link
Contributor

Nexucis commented Sep 4, 2018

Bug Report

What did you do?
Write code to implement custom service discovery;

What did you expect to see?
I expect that the targets that don't exist are removed in the custom_sd.json.

What did you see instead? Under which circumstances?
When a target is removed in the custom discovery, there is no way to remove the targetGroup in the updated channel

Environment

  • System information:

    • Windows10 64bit,
    • Darwin 16.7.0 x86_64
  • Prometheus version:

2.3.2

  • GO Version :

    • 1.10.1 windows/amd64
    • 1.11 darwin/amd64

Demo

I write a short demo to reproduce the bug

type File struct {
	Source      string `yaml:"source"`
	Address     string `yaml:"address"`
	Application string `yaml:"application"`
}

func readFile() ([]*File, error) {
	b, err := ioutil.ReadFile("./data.yaml")

	if err != nil {
		return nil, err
	}
	var res []*File
	err = yaml.UnmarshalStrict(b, &res)
	return res, err
}

func (d *discovery) parseServiceNodes(f *File) *targetgroup.Group {
	tgroup := targetgroup.Group{
		Source: f.Source,
		Labels: make(model.LabelSet),
	}

	tgroup.Targets = make([]model.LabelSet, 0, 1)

	target := model.LabelSet{model.AddressLabel: model.LabelValue(f.Address)}
	labels := model.LabelSet{
		model.AddressLabel:                    model.LabelValue(f.Address),
		model.LabelName("__meta_application"): model.LabelValue(f.Application),
		model.LabelName("__meta_source"):      model.LabelValue(f.Source),
	}

	tgroup.Labels = labels
	tgroup.Targets = append(tgroup.Targets, target)
	return &tgroup
}

func (d *discovery) Run(ctx context.Context, ch chan<- []*targetgroup.Group) {
	for c := time.Tick(time.Duration(d.refreshInterval) * time.Second); ; {
		file, err := readFile()
		if err != nil {
			level.Error(d.logger).Log("msg", "Error reading services list", "err", err)
			time.Sleep(time.Duration(d.refreshInterval) * time.Second)
			continue
		}

		var tgs []*targetgroup.Group

		for _, f := range file {

			tgs = append(tgs, d.parseServiceNodes(f))
		}

		ch <- tgs
		// Wait for ticker or exit when ctx is closed.
		select {
		case <-c:
			continue
		case <-ctx.Done():
			return
		}
	}
}

The data.yaml file is "a service" that giving me all the targets that should be scraped.

When the code is running change the source in the file "data.yaml" like the following step :

  1. With this data.yaml :
- source: "source2"
  address: "localhost:1954"
  application: "PYM"

We have this custom_sd.json :

[
    {
        "targets": [
            "localhost:1954"
        ],
        "labels": {
            "__address__": "localhost:1954",
            "__meta_application": "PYM",
            "__meta_source": "source2"
        }
    }
]
  1. Now I just change the source in the file data.yaml:
- source: "source1"
  address: "192.168.1.12:9090"
  application: "FOO"

And I have this file as result :

[
    {
        "targets": [
            "localhost:1954"
        ],
        "labels": {
            "__address__": "localhost:1954",
            "__meta_application": "PYM",
            "__meta_source": "source2"
        }
    },
    {
        "targets": [
            "192.168.1.12:9090"
        ],
        "labels": {
            "__address__": "192.168.1.12:9090",
            "__meta_application": "FOO",
            "__meta_source": "source1"
        }
    }
]

Instead of having this one :

[
    {
        "targets": [
            "192.168.1.12:9090"
        ],
        "labels": {
            "__address__": "192.168.1.12:9090",
            "__meta_application": "FOO",
            "__meta_source": "source1"
        }
    }
]

Crappy Workaround:

The only way that I found to remove the target in the file is to send an update of the targetGroup.source with all the other parameter to nil. But it's creating an empty target group as below

[
    {
        "targets": [],
        "labels": {}
    },
    {
        "targets": [
            "192.168.1.12:9090"
        ],
        "labels": {
            "__address__": "192.168.1.12:9090",
            "__meta_application": "FOO",
            "__meta_source": "source1"
        }
    }
]
@simonpasquier

This comment has been minimized.

Copy link
Member

simonpasquier commented Sep 4, 2018

The only way that I found to remove the target in the file is to send an update of the targetGroup.source with all the other parameter to nil.

This is exactly what needs to be done and this is what integrated SD mechanisms do. See the file SD for instance:

// Send empty updates for sources that disappeared.
for f, n := range d.lastRefresh {
m, ok := ref[f]
if !ok || n > m {
level.Debug(d.logger).Log("msg", "file_sd refresh found file that should be removed", "file", f)
d.deleteTimestamp(f)
for i := m; i < n; i++ {
select {
case ch <- []*targetgroup.Group{{Source: fileSource(f, i)}}:
case <-ctx.Done():
return
}
}
}
}

@Nexucis

This comment has been minimized.

Copy link
Contributor Author

Nexucis commented Sep 4, 2018

Thanks for replying.

Does it mean that the custom_sd.json file will grow indefinitely everytime a source disappears ? Something like that :

[
    {
        "targets": [],
        "labels": {}
    },
    {
        "targets": [],
        "labels": {}
    },
    {
        "targets": [],
        "labels": {}
    },

[......]

    {
        "targets": [
            "192.168.1.12:9090"
        ],
        "labels": {
            "__address__": "192.168.1.12:9090",
            "__meta_application": "FOO",
            "__meta_source": "source1"
        }
    }
]

Would it be possible to create a function to unregister a source in the manager?

@simonpasquier

This comment has been minimized.

Copy link
Member

simonpasquier commented Sep 5, 2018

Indeed it might create several items with empty target lists if source varies. But there's nothing wrong with the manager, it should be fixed in the adapter code instead.

@Nexucis

This comment has been minimized.

Copy link
Contributor Author

Nexucis commented Sep 5, 2018

I'm not saying that the manager is wrong. I'm just saying, since the manager is responsible to keep in memory the different target retrieved, it should be the responsibility of the manager to unregister the target.

And it's with this logic that I create a light pull request to "fix" the bug. :)

@lock lock bot locked and limited conversation to collaborators Mar 26, 2019

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
You can’t perform that action at this time.