Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

joint_inference: bug fix and interface reconstruction #117

Merged
merged 2 commits into from
Aug 6, 2021

Conversation

JoeyHwong-gk
Copy link
Contributor

@JoeyHwong-gk JoeyHwong-gk commented Jul 7, 2021

  1. fix example bug: save result which get from cloud if is hard example
  2. Add docs and code comment
  3. rename: TSBigModelService -> BigModelService
  4. fix [Joint Inference] edge worker needs to fall back the available state when cannot connect cloud worker #133

@kubeedge-bot kubeedge-bot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Jul 7, 2021
@kubeedge-bot kubeedge-bot added the do-not-merge/invalid-commit-message Indicates that a PR should not merge because it has an invalid commit message. label Jul 20, 2021
@kubeedge-bot kubeedge-bot removed the do-not-merge/invalid-commit-message Indicates that a PR should not merge because it has an invalid commit message. label Jul 21, 2021
@JoeyHwong-gk JoeyHwong-gk changed the title Code Maintenance:joint_inference joint_inference: bug fix and interface reconstruction Jul 21, 2021
image_paded = image_paded / 255.

preprocessed_data = image_paded.astype(np.float32)[np.newaxis, :]
return preprocessed_data, org_img_shape
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unify org_img_shape and ori_img_shape

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fix

@@ -12,12 +12,12 @@
# See the License for the specific language governing permissions and
# limitations under the License.

import json
import time
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fix import rank

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fix

@JoeyHwong-gk
Copy link
Contributor Author

/assign @TymonXie

return ious


def nms(bboxes, iou_threshold, sigma=0.3, method='nms'):
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is mean by nms?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Resolved in below comments

pred_conf = pred_bbox[:, 4]
pred_prob = pred_bbox[:, 5:]

# (1) (x, y, w, h) --> (xmin, ymin, xmax, ymax)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please put more information about this note

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fix

@ugvddm
Copy link

ugvddm commented Aug 3, 2021

/assign @jaypume

@JoeyHwong-gk
Copy link
Contributor Author

/help @llhuii

cv2.putText(img_copy, text, (p1[1], p1[0] + 20 * (label + 1)),
cv2.FONT_HERSHEY_SIMPLEX, 0.6, (255, 0, 0),
text_thickness, line_type)
except TypeError as err:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add the comment to explain that captures the TypeError here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fix


def inference(self, data=None, post_process=None, **kwargs):
"""
Inference task for IncrementalLearning
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

incremental in joint_inference file

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fix

:param data: inference sample
:param post_process: post process
:param kwargs: params for inference of customize estimator
:return: if is hard sample, real result,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not just return the final result?

inference_result = InferenceResult(
*inference_instance.inference(img_rgb))
output_deal(inference_result, nframe, img_rgb)
is_hard_example, final_result, edge_result, cloud_result = (
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the user view, he may only care about the final inference result.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We had a discussion before, user may need to compare the difference in the inference results of the big/little model. In this example, the user keeps all hard samples at the same time, and is considered to be able to find out any samples with low-confidence, for post-processing

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You also need to consider the users that don't care the image is hard or not.

@ugvddm
Copy link

ugvddm commented Aug 6, 2021

/lgtm

@kubeedge-bot kubeedge-bot added the lgtm Indicates that a PR is ready to be merged. label Aug 6, 2021
@llhuii
Copy link

llhuii commented Aug 6, 2021

/hold
please squash the latest three commits
image

@kubeedge-bot kubeedge-bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Aug 6, 2021
@kubeedge-bot kubeedge-bot removed the lgtm Indicates that a PR is ready to be merged. label Aug 6, 2021
@kubeedge-bot kubeedge-bot added needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Aug 6, 2021
fix message when http connect fail

Signed-off-by: JoeyHwong <joeyhwong@gknow.cn>
- make the hard_example_mining alg to be a common interface
- fix get_hem_from_config: raise exception when value is unexpected

Signed-off-by: JoeyHwong <joeyhwong@gknow.cn>
@kubeedge-bot kubeedge-bot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Aug 6, 2021
@llhuii
Copy link

llhuii commented Aug 6, 2021

/hold cancel

@kubeedge-bot kubeedge-bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Aug 6, 2021
@llhuii
Copy link

llhuii commented Aug 6, 2021

/approve

@kubeedge-bot
Copy link
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: llhuii

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@kubeedge-bot kubeedge-bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 6, 2021
@llhuii
Copy link

llhuii commented Aug 6, 2021

/lgtm

/lgtm

@kubeedge-bot kubeedge-bot added the lgtm Indicates that a PR is ready to be merged. label Aug 6, 2021
@kubeedge-bot kubeedge-bot merged commit 57be75c into kubeedge:main Aug 6, 2021
@JoeyHwong-gk JoeyHwong-gk deleted the joint branch August 13, 2021 03:10
vcozzolino added a commit to vcozzolino/sedna that referenced this pull request Dec 3, 2021
Upgrade to v0.4.0

Created-by: Vittorio Cozzolino 00609018
Author-id: 553076
MR-id: 12361350
Commit-by: Vittorio Cozzolino;KubeEdge Bot;JoeyHwong;JimmyYang20;ShiXiaohou;DanLiu;HenryChou;Yutong Wang;EnfangCui;llhuii;XinYao1994;Jie Pu;wei.ji
Merged-by: Vittorio Cozzolino 00609018
E2E-issues: 
Description:
fix incremental_learning bug
- Add docs and code comment
- fix bugs: epoch always be 1, inference result not saved, s3 upload
  fail

Signed-off-by: JoeyHwong <joeyhwong@gknow.cn>,
Interface Improvement:
   1. The algorithm of HardExampleMining should be seleted by the developer.

Signed-off-by: JoeyHwong <joeyhwong@gknow.cn>,
install.sh: fix LC_BIND_PORT bug

rename the variable LC_BIND_PORT to SENDA_LC_BIND_PORT.

Signed-off-by: llhuii <liulinghui@huawei.com>,
Merge pull request kubeedge#121 from llhuii/fix-install-script-bug

install.sh: fix LC_BIND_PORT bug,
Update interface.py

fix env missing bug

Signed-off-by: JoeyHwong <joeyhwong@gknow.cn>,
fix bug: aggregation of weights should occur in the AggServer

Signed-off-by: JoeyHwong <joeyhwong@gknow.cn>,
fix bug: Cloud worker not exiting

Signed-off-by: JoeyHwong <joeyhwong@gknow.cn>,
gm: refactor all features into independent dir

All controllers are placed into globalmanager/controllers:
1. each feature has the independent subdirectory
2. upstream/downstream are kept as top level.

Commom types/utils/worker.go are placed into globalmanager/runtime.

Signed-off-by: llhuii <liulinghui@huawei.com>,
fix PR comment
- clean useless code
- catch server exception in threads

Signed-off-by: JoeyHwong <joeyhwong@gknow.cn>,
gm: refactor upstream controller

Split upstream controller, merge each feature CR logic code into its
controller.

Signed-off-by: llhuii <liulinghui@huawei.com>,
gm: share client/Informer with all controllers

Make all controllers sharing with:
1. kubernetes client, and informerFactory with random resync period.
2. sedna crd client, and informerFactory with random resync period.

This can reduce code and improve slim performance.

Signed-off-by: llhuii <liulinghui@huawei.com>,
gm: add dataset controller

Only handle dataset update from edge.

Signed-off-by: llhuii <liulinghui@huawei.com>,
Merge pull request kubeedge#106 from JoeyHwong-gk/federated

fix federated learning bugs,
gm: split all upstream logic into separate file

Signed-off-by: llhuii <liulinghui@huawei.com>,
gm: split all downstream logic into separate file

Since all CR watch actions are placed into corresponding controller,
controllers/downstream.go is unnecessary.

Signed-off-by: llhuii <liulinghui@huawei.com>,
LC: fix nil pointer dereference bug

It happened in evalTask of incremental job when deploy model hasn't
been synced to LC. evalTask should return error instead of logging
error. And it doesn't need job id info into error, same as trainTask.

Signed-off-by: JimmyYang20 <yangjin39@huawei.com>,
Merge pull request kubeedge#139 from JimmyYang20/fixbug

LC: fix nil pointer dereference bug,
LC: send dataset update to GM only when changed

number of samples has been sent to GM only when adding new data.

Signed-off-by: JimmyYang20 <yangjin39@huawei.com>,
Merge pull request kubeedge#137 from JimmyYang20/main

LC: send dataset update to GM only when changed,
lifelong learning s3 support
- fix file_ops method
- fix kb save bug

Signed-off-by: JoeyHwong <joeyhwong@gknow.cn>,
Add object search and tracking docs to docs/proposals/

Add object search and tracking crd samples to build/crd-samples/sedna/

Add object search and tracking type.go files to pkg/apis/sedna/v1alpha1/

Signed-off-by: EnfangCui <17111008@bjtu.edu.cn>,
Merge pull request kubeedge#100 from EnfangCui/add-multi-edge-inference-PR

Add object search and tracking proposals,
gm: more code clean after initial refactor done

1. remove the feature redundant name in all feature controllers(e.g.
'federatedlearningJob' to 'job'), since it has already own independent
package, no need the feature extra name
2. upstream interface optimizaztion
3. fix empty Kind of all CR in downstream
4. add extra doc string
5. fix code style

Signed-off-by: llhuii <liulinghui@huawei.com>,
Fix the problem that kbimage cannot be compiled in Makefile

Signed-off-by: wei.ji <wei.ji@easystack.cn>,
improve lifelong learning docs

1. improve the atc example words
2. fix the broken links in lifelong proposal

Signed-off-by: JoeyHwong <joeyhwong@gknow.cn>,
Merge pull request kubeedge#146 from Jw-Jm/main

Fix the problem that kbimage cannot be compiled in Makefile,
make the hard_example_mining alg to be a common interface

Signed-off-by: JoeyHwong <joeyhwong@gknow.cn>,
Merge pull request kubeedge#134 from llhuii/refactor-gm

gm: decouple all features into independent package,
Merge pull request kubeedge#107 from JoeyHwong-gk/incremental

[incremental learning] example:keep all  results whether is hardExample or not, fixed the issue of using s3 to save model,
Merge pull request kubeedge#143 from JoeyHwong-gk/lldoc

improve lifelong learning docs,
fix example bug: save result which get from cloud if is hard example

fix message when http connect fail

Signed-off-by: JoeyHwong <joeyhwong@gknow.cn>,
fix pr comment
- make the hard_example_mining alg to be a common interface
- fix get_hem_from_config: raise exception when value is unexpected

Signed-off-by: JoeyHwong <joeyhwong@gknow.cn>,
lc: decouple all features into independent package

Signed-off-by: JimmyYang20 <yangjin39@huawei.com>,
Merge pull request kubeedge#117 from JoeyHwong-gk/joint

joint_inference: bug fix and interface reconstruction,
Merge pull request kubeedge#149 from JimmyYang20/refector-lc

lc: decouple all features into independent package,
fix lifelong issue
- Reduce parameters for initial
- show all interfaces of lifelong learning in example
- fix bugs from deep learning framework

Signed-off-by: JoeyHwong <joeyhwong@gknow.cn>,
fix il doc

Signed-off-by: JimmyYang20 <yangjin39@huawei.com>,
Merge pull request kubeedge#153 from JimmyYang20/fix-doc

Fix rendering issue of example doc in readthedocs,
Merge pull request kubeedge#142 from JoeyHwong-gk/lls3

lifelong learning: issue-driven interface-adjustment and bug fix,
fix the lifelong example problem from backend and constant

- fix sklearn backend: support args in train/eval/infer
- fix lifelong constant

Signed-off-by: JoeyHwong <joeyhwong@gknow.cn>,
Automatic push images when publishing a release

A github action is added for pushing image when a new release is created:
1. login docker hub.
2. checkout the project, and run `make push-all`.

Signed-off-by: llhuii <liulinghui@huawei.com>,
Merge pull request kubeedge#154 from JoeyHwong-gk/lifelong

[Lifelong example]: fix the problem from backend and constant,
docs: update install guide

1. add GM/LC links
2. add GM/LC deploy form

Signed-off-by: llhuii <liulinghui@huawei.com>,
Merge pull request kubeedge#155 from llhuii/add-image-push-gh-action

Push images automatically when a new release is created,
Merge pull request kubeedge#156 from llhuii/update-install-doc

docs: update install guide,
IL: LC supports to recover job when restart

Signed-off-by: JimmyYang20 <yangjin39@huawei.com>,
Merge pull request kubeedge#152 from JimmyYang20/fixbug

IL: LC supports to recover job when restart,
Fix IMAGE_REPO in github image-publish action

Using the env 'GITHUB_REPOSITORY' instead of 'GITHUB_ACTOR' to get the right
image repo name i.e. `IMAGE_REPO` in Makefile.

Signed-off-by: llhuii <liulinghui@huawei.com>,
add lib doc

Signed-off-by: JoeyHwong <joeyhwong@gknow.cn>,
Improve the docs

Signed-off-by: JoeyHwong <joeyhwong@gknow.cn>,
fix syntax and information in the docs
Signed-off-by: JoeyHwong <joeyhwong@gknow.cn>,
update lib doc

Signed-off-by: JoeyHwong <joeyhwong@gknow.cn>,
Update s3 example docs of IL&JI

Signed-off-by: JimmyYang20 <yangjin39@huawei.com>,
Support websocket reconnection

Signed-off-by: JoeyHwong <joeyhwong@gknow.cn>,
Lib support hot model update

Signed-off-by: JoeyHwong <joeyhwong@gknow.cn>,
Adjusting the Log of IncrementalLearning example

Signed-off-by: JoeyHwong <joeyhwong@gknow.cn>,
Fix codegen verify checker

Note the codegen verify checker should report error

Signed-off-by: llhuii <liulinghui@huawei.com>,
Add the missing gencode for objectsearch/tracking

Signed-off-by: llhuii <liulinghui@huawei.com>,
fix job_kind value in LC_report

Signed-off-by: JoeyHwong <joeyhwong@gknow.cn>,
Update dependency of server request in lib
- replace `retry==1.3.3` with `tenacity==8.0.1` because of `retry` no longer maintained.

Signed-off-by: JoeyHwong <joeyhwong@gknow.cn>,
Merge pull request kubeedge#158 from llhuii/fix-imagerepo-of-image-publish-action

Fix IMAGE_REPO in github image-publish action,
Merge pull request kubeedge#164 from JoeyHwong-gk/federated

Support websocket reconnection when the server status is abnormal,
Merge pull request kubeedge#166 from llhuii/fix-verify-checker

Fix codegen verify checker,
Add contributing docs

Signed-off-by: llhuii <liulinghui@huawei.com>,
Merge pull request kubeedge#159 from JoeyHwong-gk/libdoc

update lib doc,
Merge pull request kubeedge#148 from llhuii/add-contributing-docs

Add contributing docs,
Merge pull request kubeedge#160 from JimmyYang20/doc-s3

Update s3 example docs of IL&JI,
fix access exceptions when rendering with sphinx

Signed-off-by: JoeyHwong <joeyhwong@gknow.cn>,
Merge pull request kubeedge#150 from JoeyHwong-gk/docs

docs improvement,
GM&LC: IL supports model hot updates

Signed-off-by: JimmyYang20 <yangjin39@huawei.com>,
fix pr comment

Signed-off-by: JoeyHwong <joeyhwong@gknow.cn>,
Merge pull request kubeedge#138 from JimmyYang20/modelhotupdate

GM&LC: IL supports model hot updates,
Fix s3 example docs of IL&JI

Signed-off-by: JimmyYang20 <yangjin39@huawei.com>,
Merge pull request kubeedge#174 from JimmyYang20/doc-s3

Fix s3 example docs of IL&JI,
Merge pull request kubeedge#157 from JoeyHwong-gk/hot_model

[Lib Support] hot model update,
Upgrade gorilla/websocket from v1.4.0 to v1.4.2

This upgrade fixes a potential DoS vector bug in gorilla/websocket 1.4.0,
see GHSA-jf24-p9p9-4rjh

Signed-off-by: llhuii <liulinghui@huawei.com>,
Merge pull request kubeedge#182 from llhuii/upgrade-websocket

Upgrade gorilla/websocket from v1.4.0 to v1.4.2,
fix lib/requirements
- This upgrade fixes a CSRF error in FastAPI version earlier than 0.65.2,
see GHSA-8h2j-cgx8-6xv7

Signed-off-by: JoeyHwong <joeyhwong@gknow.cn>,
Merge pull request kubeedge#183

See merge request butterfly/sedna!9
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Joint Inference] edge worker needs to fall back the available state when cannot connect cloud worker
6 participants