Refactor job scheduler to use memdb for jobs #782

Merged
merged 8 commits into from
Feb 24, 2022

Conversation

radeksimko
Member

@radeksimko radeksimko commented Feb 3, 2022

This represents the 2nd stage of the refactoring described in #719.

Depends on #771

Closes #719
Closes #768
Closes #775
Closes #800


Why

As described in #719, the existing logic is hard to test and requires time.Sleep in cases where dependent pieces of work are scheduled, i.e. synchronization is difficult or non-existent. This leads to a number of flaky tests, which makes testing any other bug or feature more difficult.

The lack of synchronization also causes, for example, completion to use outdated data, as reported in #768 and #775: textDocument/completion can run before textDocument/didChange has had the chance to finish applying changes to the document and re-parsing it.

New job package

The new job package makes enqueueing and consuming jobs from various places easier without running into import cycles.

type Job struct {
	// Func represents the job to execute
	Func func(ctx context.Context) error

	// Dir describes the directory which the job belongs to,
	// which is used for deduplication of queued jobs (along with Type)
	// and prioritization
	Dir document.DirHandle

	// Type describes type of the job (e.g. GetTerraformVersion),
	// which is used for deduplication of queued jobs along with Dir.
	Type string

	// Defer is a function to execute after Func is executed
	// and before the job is marked as done (StateDone).
	// This can be used to schedule jobs dependent on the main job.
	Defer DeferFunc
}

// DeferFunc represents a deferred function scheduling more jobs
// based on jobErr (any error returned from the main job).
// Newly queued job IDs should be returned to allow for synchronization.
type DeferFunc func(ctx context.Context, jobErr error) IDs

Example

id, err = w.jobStore.EnqueueJob(job.Job{
	Dir: modHandle,
	Func: func(ctx context.Context) error {
		return ParseModuleManifest(w.fs, w.modStore, dir)
	},
	Type:  op.OpTypeParseModuleManifest.String(),
	Defer: decodeCalledModulesFunc(w.fs, w.modStore, w.schemaStore, w.watcher, dir),
})

New jobs memdb table

A new memdb table was created to store queued/running jobs and to provide synchronization via watch channels in memdb.

type ScheduledJob struct {
	job.ID
	job.Job
	IsDirOpen bool
	State     State

	// JobErr contains error when job finishes (State = StateDone)
	JobErr error
	// DeferredJobIDs contains IDs of any deferred jobs
	// set when job finishes (State = StateDone)
	DeferredJobIDs job.IDs
}

New general-purpose scheduler (package)

Scheduling was previously done via moduleLoader, essentially an implementation detail of ModuleManager, which made everything depend on ModuleManager. Internally, moduleLoader also used a priority queue implemented via container/heap. The queue had to be re-sorted on every insertion (even though no sorting actually took place most of the time, unless the user opened or closed affected files). De-duplication was done via individual Module fields indicating the state of each operation, which made the whole logic very verbose and error-prone.

if operationState(mod, modOp.Type) == op.OpStateQueued {
	// avoid enqueuing duplicate operation
	modOp.markAsDone()
	return nil
}
switch modOp.Type {
case op.OpTypeGetTerraformVersion:
	ml.modStore.SetTerraformVersionState(modOp.ModulePath, op.OpStateQueued)
case op.OpTypeObtainSchema:
	ml.modStore.SetProviderSchemaState(modOp.ModulePath, op.OpStateQueued)
case op.OpTypeParseModuleConfiguration:
	ml.modStore.SetModuleParsingState(modOp.ModulePath, op.OpStateQueued)
case op.OpTypeParseVariables:
	ml.modStore.SetVarsParsingState(modOp.ModulePath, op.OpStateQueued)
case op.OpTypeParseModuleManifest:
	ml.modStore.SetModManifestState(modOp.ModulePath, op.OpStateQueued)
case op.OpTypeLoadModuleMetadata:
	ml.modStore.SetMetaState(modOp.ModulePath, op.OpStateQueued)
case op.OpTypeDecodeReferenceTargets:
	ml.modStore.SetReferenceTargetsState(modOp.ModulePath, op.OpStateQueued)
case op.OpTypeDecodeReferenceOrigins:
	ml.modStore.SetReferenceOriginsState(modOp.ModulePath, op.OpStateQueued)
case op.OpTypeDecodeVarsReferences:
	ml.modStore.SetVarsReferenceOriginsState(modOp.ModulePath, op.OpStateQueued)
}

Instead, each pending job now has its own entry in the new jobs memdb table, which allows for more efficient querying when checking for duplicate jobs and when distinguishing between open and closed directories.
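
As a rough illustration of de-duplication by Dir and Type and of preferring open directories, consider the sketch below. It is an assumption-laden stand-in: the queue type and its Enqueue and Next methods are hypothetical, a slice with linear scans replaces memdb indexes, and returning the existing ID for a duplicate is a choice made for this example only.

```go
package main

import "fmt"

// queuedJob is a reduced stand-in for an entry in the jobs table.
type queuedJob struct {
	ID        int
	Dir       string
	Type      string
	IsDirOpen bool
}

// queue is a hypothetical in-memory replacement for the memdb table.
type queue struct {
	nextID int
	jobs   []queuedJob
}

// Enqueue adds a job unless one with the same Dir and Type is already
// queued, in which case the existing ID is returned and isNew is false.
func (q *queue) Enqueue(dir, typ string, isDirOpen bool) (id int, isNew bool) {
	for _, j := range q.jobs {
		if j.Dir == dir && j.Type == typ {
			return j.ID, false
		}
	}
	q.nextID++
	q.jobs = append(q.jobs, queuedJob{
		ID: q.nextID, Dir: dir, Type: typ, IsDirOpen: isDirOpen,
	})
	return q.nextID, true
}

// Next removes and returns the oldest job belonging to an open directory,
// falling back to the oldest job overall when no directory is open.
func (q *queue) Next() (queuedJob, bool) {
	if len(q.jobs) == 0 {
		return queuedJob{}, false
	}
	idx := 0
	for i, j := range q.jobs {
		if j.IsDirOpen {
			idx = i
			break
		}
	}
	j := q.jobs[idx]
	q.jobs = append(q.jobs[:idx], q.jobs[idx+1:]...)
	return j, true
}

func main() {
	q := &queue{}
	q.Enqueue("/closed", "ParseModuleConfiguration", false)
	q.Enqueue("/open", "ParseModuleConfiguration", true)
	_, isNew := q.Enqueue("/open", "ParseModuleConfiguration", true)
	next, _ := q.Next()
	fmt.Println("duplicate accepted:", isNew, "next dir:", next.Dir)
}
```

With one entry per job, the duplicate check becomes a lookup on (Dir, Type) rather than a switch over per-operation state fields on the module.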

Relatedly, a (mostly) general-purpose scheduler was created to simply execute arbitrary Funcs associated with Dirs and then execute any Defer functions (dependent jobs).

Removal of module manager

The module manager is no more, as its individual responsibilities were split between the scheduler and the jobs memdb table.

@radeksimko radeksimko marked this pull request as ready for review February 4, 2022 19:12
@radeksimko radeksimko force-pushed the f-jobs-memdb-refactoring branch 11 times, most recently from 9a89195 to 9223958 Compare February 11, 2022 15:07
@radeksimko radeksimko requested a review from a team February 11, 2022 16:55
@radeksimko radeksimko force-pushed the f-jobs-memdb-refactoring branch 4 times, most recently from 5da77d4 to ed4aa4d Compare February 18, 2022 08:22
The old test checked *all* schemas available for a given path after indexing via the walker.

This part of the API is, however, being deprecated in favour of a more straightforward one (see state.ProviderSchemaStore -> ProviderSchema()), which picks the single most appropriate schema from a list of candidates and keeps the decision logic as an implementation detail. The whole list of candidates is therefore not available from the outside (by design), and hence not something we should test from the outside.

ProviderSchemaStore itself already has plenty of tests covering the decision logic, which is a better place for testing this anyway.
Contributor

@jpogran jpogran left a comment

Works on my machine!

@radeksimko radeksimko added this to the v0.26.0 milestone Feb 23, 2022
Member

@dbanck dbanck left a comment

Awesome work!

🚢 !

@github-actions

This functionality has been released in v0.26.0 of the language server.
If you use the official Terraform VS Code extension, it will prompt you to upgrade to this version automatically upon next launch or within the next 24 hours.

For further feature requests or bug reports with this functionality, please create a new GitHub issue following the template. Thank you!

@github-actions

I'm going to lock this pull request because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Apr 21, 2022