Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore: (cstor-volume-mgmt) Add support to update cstorvolume status with known replicas #1447

Merged
merged 16 commits into from
Oct 1, 2019

Conversation

mittachaitu
Copy link

@mittachaitu mittachaitu commented Sep 19, 2019

What this PR does / why we need it:
This PR does the following changes:

  1. Add support to update trusty replica information in cstorvolume status to make replica information persist in etcd.
  2. Add support to trigger istgtcontrol drf from volume-mgmt on success updates istgt.conf file(target configuration file).
    Ex: command istgtcontrol drf <vol_name> <value>
    To implement this cstor-volume-mgmt starts the new UnixDomainServer and cstor-istgt(main car) connects to it and update the information related to known replicas and configurations.

Example YAML:

spec:
  capacity: 5G
  consistencyFactor: 1
  desiredReplicationFactor: 1
  iqn: iqn.2016-09.com.openebs.cstor:pvc-a2a53a14-1cc4-4bdd-ad4b-ab88e07d7823
  nodeBase: iqn.2016-09.com.openebs.cstor
  replicationFactor: 1
  status: Init
  targetIP: 10.0.0.28
  targetPort: "3260"
  targetPortal: 10.0.0.28:3260
status:
  capacity: 5G
  lastTransitionTime: "2019-09-27T08:31:54Z"
  lastUpdateTime: "2019-09-27T08:33:53Z"
  phase: Healthy
  replicaDetails: {}
  replicaStatuses:
  - checkpointedIOSeq: "0"
    inflightRead: "0"
    inflightSync: "0"
    inflightWrite: "0"
    mode: Healthy
    quorum: "1"
    replicaId: "16362917944657237302"
    upTime: 147

Below two fields are introduced in the schema as part of replica scaleup support.
replicaDetails -- replicas present in this list are trusty replicas from where the target can perform read operations.
desiredReplicationFactor -- It is useful to scale up replicas from n to m (where m <= 5) by default it will be equal to replicationFactor.

For more information, please follow the doc: openebs/openebs#2739

Special notes for your reviewer:
TODO:

  1. Need to test this code after fixing istgt travis issue.
  2. UnitTest cases for this PR changes.

Checklist:

  • Fixes #
  • Labelled this PR & related issue with documentation tag
  • PR messages has document related information
  • Labelled this PR & related issue with breaking-changes tag
  • PR messages has breaking changes related information
  • Labelled this PR & related issue with requires-upgrade tag
  • PR messages has upgrade related information
  • Commit has unit tests
  • Commit has integration tests

pkg/cstor/volume/targetserver/server.go Outdated Show resolved Hide resolved
pkg/cstor/volume/targetserver/server.go Outdated Show resolved Hide resolved
pkg/cstor/volume/targetserver/server.go Outdated Show resolved Hide resolved
pkg/cstor/volume/targetserver/server.go Outdated Show resolved Hide resolved
pkg/cstor/volume/targetserver/server.go Outdated Show resolved Hide resolved
pkg/cstor/volume/targetserver/server.go Outdated Show resolved Hide resolved
pkg/cstor/volume/targetserver/server.go Outdated Show resolved Hide resolved
pkg/cstor/volume/targetserver/server.go Outdated Show resolved Hide resolved
mittachaitu added 4 commits September 19, 2019 16:43
Signed-off-by: mittachaitu <sai.chaithanya@mayadata.io>
Signed-off-by: mittachaitu <sai.chaithanya@mayadata.io>
Signed-off-by: mittachaitu <sai.chaithanya@mayadata.io>
Signed-off-by: mittachaitu <sai.chaithanya@mayadata.io>
Signed-off-by: mittachaitu <sai.chaithanya@mayadata.io>
@mittachaitu mittachaitu added the pr/release-note PR should be included in release notes label Sep 19, 2019
@mittachaitu mittachaitu self-assigned this Sep 19, 2019
func StartTargetServer(kubeConfig string) {

klog.Info("Starting unix domain server")
if err := os.RemoveAll(string(volumeMgmtUnixSock)); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why to remove?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before starting server on that sock file clearing the sock path.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not required

if n > 0 {
completeBytes = append(completeBytes, buf[0:n]...)
if strings.HasSuffix(string(completeBytes), endOfLine) {
lines := strings.Split(string(completeBytes), endOfLine)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why to Split and why to append?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This piece of code needs fix.. please put the communication protocol in comments

pkg/cstor/volume/targetserver/server.go Outdated Show resolved Hide resolved
}
if n > 0 {
completeBytes = append(completeBytes, buf[0:n]...)
if strings.HasSuffix(string(completeBytes), endOfLine) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if completedBytes contains multiple lines?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we discussed, add comment here about communication protocol.

pkg/cstor/volume/targetserver/server.go Outdated Show resolved Hide resolved
}

// GetRequiredData returns error if doesn't have json format
func GetRequiredData(data string) (string, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Though comment is there, Let's change this function name something like GetJsonData..

lines := strings.Split(string(completeBytes), endOfLine)
for _, line := range lines {
if len(line) != 0 {
req = append(req, line+endOfLine)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove endOfLine

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

made into return lines[0]

klog.Errorf("failed to read data: {%v}", err)
return
}
filteredData, err = GetRequiredData(readData)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to do this. You can execute unmarshal on readData.

for {
sockFd, err := listen.Accept()
if err != nil {
klog.Fatalf("failed to accept error: {%v}", err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need of Fatalf here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if not fatal, we need to handle it then by starting server again

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was for old code.

pkg/cstor/volume/targetserver/server.go Outdated Show resolved Hide resolved
@mittachaitu mittachaitu changed the title chore: (cstor-volume-mgmt) Add support to update cstorvolume status with known replicas [WIP]chore: (cstor-volume-mgmt) Add support to update cstorvolume status with known replicas Sep 25, 2019
1. Included executing istgtcontrol drf command
2. Updating istgt conf file by taking lock
3. Handiling to do part as part of server to start

Signed-off-by: mittachaitu <sai.chaithanya@mayadata.io>
Copy link

@ghost ghost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Found some fixes!

P.S. share your ideas, feedbacks or issues with us at https://github.com/fixmie/feedback (this message will be removed after the beta stage).

cmd/cstor-volume-mgmt/volume/volume.go Show resolved Hide resolved
pkg/apis/openebs.io/v1alpha1/cstor_volume.go Show resolved Hide resolved
pkg/cstor/volume/targetserver/server.go Outdated Show resolved Hide resolved
@mittachaitu mittachaitu changed the title [WIP]chore: (cstor-volume-mgmt) Add support to update cstorvolume status with known replicas chore: (cstor-volume-mgmt) Add support to update cstorvolume status with known replicas Sep 26, 2019
Signed-off-by: mittachaitu <sai.chaithanya@mayadata.io>
Signed-off-by: mittachaitu <sai.chaithanya@mayadata.io>
mittachaitu added 2 commits September 27, 2019 17:14
Signed-off-by: mittachaitu <sai.chaithanya@mayadata.io>
Signed-off-by: mittachaitu <sai.chaithanya@mayadata.io>
Copy link

@ghost ghost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Found some fixes!

P.S. share your ideas, feedbacks or issues with us at https://github.com/fixmie/feedback (this message will be removed after the beta stage).

pkg/cstor/volume/targetserver/server.go Outdated Show resolved Hide resolved
@@ -214,7 +233,9 @@ func (c *CStorVolumeController) cStorVolumeEventHandler(
cStorVolumeGot.Status.LastTransitionTime = cStorVolumeGot.Status.LastUpdateTime
}

cStorVolumeGot.Status.ReplicaStatuses = volStatus.ReplicaStatuses
if volStatus != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no need for this check

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It may be required if errror occurred for istgtcontrol replica status we are not returning(I don't know the reason) so instead of returning I made this check.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

volStatus should be validated after volume.GetVolumeStatus.

volStatus, err := volume.GetVolumeStatus(cStorVolumeGot)
		if err != nil || volStatus == nil {

Let's add an issue in hacktoberfest to fix this cmd/cstor-volume-mgmt/volume.GetVolumeStatus.

}

sigc := make(chan os.Signal, 1)
signal.Notify(sigc, os.Interrupt, syscall.SIGTERM)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why for Interrupt?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we consider SIGKILL?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or you better remove all the code related to signals

Copy link
Author

@mittachaitu mittachaitu Sep 30, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If remove the code related to signals then we are not closing the listener connection which will be the problem for the next listener(i.e when the container starts again) it might throw an error address already in use.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed signal code os.RemoveAll kept as is

for {
sockFd, err := listen.Accept()
if err != nil {
klog.Errorf("failed to accept error: {%v}", err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do a break from here

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for anything other than EINTR, EAGAIN, EWOULDBLOCK, do a break

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I need to exit the process, isn't it?

mittachaitu added 2 commits September 30, 2019 13:46
Signed-off-by: mittachaitu <sai.chaithanya@mayadata.io>
Signed-off-by: mittachaitu <sai.chaithanya@mayadata.io>
c.recorder.Event(
cStorVolumeGot, corev1.EventTypeWarning,
string(common.FailureCreate),
fmt.Sprintf("failed to create cstorvolume validation "+
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's change it to failed to validate cstorvolume.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are not failed to validate we failed during validation I think current looks good

@@ -214,7 +233,9 @@ func (c *CStorVolumeController) cStorVolumeEventHandler(
cStorVolumeGot.Status.LastTransitionTime = cStorVolumeGot.Status.LastUpdateTime
}

cStorVolumeGot.Status.ReplicaStatuses = volStatus.ReplicaStatuses
if volStatus != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

volStatus should be validated after volume.GetVolumeStatus.

volStatus, err := volume.GetVolumeStatus(cStorVolumeGot)
		if err != nil || volStatus == nil {

Let's add an issue in hacktoberfest to fix this cmd/cstor-volume-mgmt/volume.GetVolumeStatus.

Signed-off-by: mittachaitu <sai.chaithanya@mayadata.io>
n, err := r.Read(buf[:])
if err != nil {
if err == io.EOF {
klog.Info("Reached End Of file")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'break' here makes to return error.. do we return error in such case?
And, is this not the regular case? if so, log may not be good to have here

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

EOF will occur only when client closes the connection and here we are expecting \r\n as end of data as mentioned in comments. If data doesn't have \r\n better to error out it. I will remove this log.

klog.Errorf("failed to read data: {%v}", err)
return
}
replicationData := &cstorv1alpha1.CVReplicationDetails{}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as it doesn't contain all the details, instead of CVReplicationDetails, how about CVReplicationInfo..

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In discussion I heard that we need to accept entire trusty replica list and perform update in etcd so I had structure name as CVReplicationDetails

klog.Errorf("failed to unmarshal replication data {%v}", err)
return
}
csc := &cstorv1alpha1.CStorVolumeConfig{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lets try to avoid creating one more struct just to pass kubeclient

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will take it as a good practice. I will pass as a functional arguments

}

// Validate verifies whether CStorReplication data read on wire is valid or not
func (csr *CVReplicationDetails) Validate() error {
Copy link
Contributor

@vishnuitta vishnuitta Sep 30, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

variable names are changing everywhere.. here why cvr csr?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I used csr everywhere --> cstorreplication

if cv.Status.ReplicaDetails.KnownReplicas == nil {
cv.Status.ReplicaDetails.KnownReplicas = map[string]uint64{}
}
cv.Status.ReplicaDetails.KnownReplicas[csc.ReplicaID] = csc.ReplicaGUID
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

log the info that you are updating

if err != nil {
return errors.Wrapf(err, "failed to get cstorvolume")
}
if len(cv.Status.ReplicaDetails.KnownReplicas) >= cv.Spec.DesiredReplicationFactor {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

replica replacement case will get error here
Also, duplicate updates also will get error here, like, if update to CR succeeded but not returned success to istgt.

mittachaitu added 2 commits September 30, 2019 21:13
Signed-off-by: mittachaitu <sai.chaithanya@mayadata.io>
Signed-off-by: mittachaitu <sai.chaithanya@mayadata.io>
if csr.VolumeName == "" {
return errors.Errorf("volume name can not be empty")
}
if csr.ReplicaKey == "" {
if csr.ReplicaID == "" {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's use length to check if string is empty.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am slowly adapting some standard practice in go by looking into core files in standard package for the string to verify whether it is empty or the will do if str == "" {//operation}. If it is map or array or slice they are using len() so I made use of it.

Signed-off-by: mittachaitu <sai.chaithanya@mayadata.io>
Copy link
Contributor

@vishnuitta vishnuitta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changes are good

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
pr/release-note PR should be included in release notes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants