
Reorg logical flow in train #37

Merged 38 commits into master May 22, 2019

Conversation

@chengfx
Contributor

commented May 11, 2019

I have reorganized the logical flow in train.py for better readability, scalability, and robustness. It is also the first piece of work toward adding an encoding cache mechanism. I have tested the regression, classification, and Chinese tasks locally, but since there are lots of changes in this PR, more testing from folks would be appreciated 😄

The current logical flow is as follows (a rough sketch in code follows the outline):

  1. Init
    1. Conf
    2. Problem
    3. other
      1. finetune
  2. Cache verification (Use cache == True)
    1. check
      1. Cache conf
      2. Problem
      3. Embedding
      4. Encoding
    2. load
  3. Data preprocessing (Use cache == False or cache verification failed)
    1. build dictionary
    2. encoding
  4. Environment preparation
    1. Cache Save (Use cache == True)
      1. create dir
      2. conf
      3. problem
      4. embedding
      5. encoding
    2. Backup
  5. Train phase
    1. init
      1. model
      2. loss
      3. optimizer
    2. train
    3. test
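
A rough sketch in Python of the reorganized control flow above. All helper names (init_problem, verify_cache, load_cache, build_dictionary, and so on) are illustrative placeholders for the steps in the outline, not the actual functions in this PR.

def main(conf):
    # 1. Init: configuration, problem definition, finetune-related setup
    problem, emb_matrix = init_problem(conf)

    # 2. Cache verification (only when conf.use_cache is True): check the
    #    cached conf / problem / embedding / encoding, then load on success
    cache_valid = False
    if conf.use_cache:
        cache_valid = verify_cache(conf)
        if cache_valid:
            problem, emb_matrix = load_cache(conf)

    # 3. Data preprocessing (cache disabled or verification failed):
    #    build the dictionary, then encode the data
    if not conf.use_cache or not cache_valid:
        emb_matrix = build_dictionary(conf, problem)
        encode_data(conf, problem)

    # 4. Environment preparation: save the cache (conf / problem /
    #    embedding / encoding) when caching is enabled, then back it up
    if conf.use_cache:
        save_cache(conf, problem, emb_matrix)
    backup_environment(conf)

    # 5. Train phase: init model / loss / optimizer, then train and test
    model, loss_fn, optimizer = init_training(conf, problem, emb_matrix)
    train_model(conf, model, loss_fn, optimizer)
    test_model(conf, model)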

Feixiang Cheng and others added some commits Apr 25, 2019

@ljshou

Member

commented May 11, 2019

Thanks for the hard work!


self.input_dicts = dict()
# init

@woailaosang

woailaosang May 14, 2019

Contributor

Readability is not good here, I think. I strongly suggest enumerating every task.

@chengfx

chengfx May 15, 2019

Author Contributor

Hi @woailaosang, the readability improvement here is specific to the train logical flow; I did less work on the other modules. Yes, enumerating every task is a good idea. But I think we first need to optimize the code to reduce the repeated code and logic; otherwise you have to modify every place whenever you want to make a change.

@chengfx chengfx requested a review from woailaosang May 15, 2019

@@ -2,5 +2,7 @@
*~
*.pyc
*.cache*
*.vs*

@woailaosang

woailaosang May 15, 2019

Contributor

What is the '.vs' directory for?

@chengfx

chengfx May 16, 2019

Author Contributor

It's just configuration from VS Code, not related to this project 😄

train.py Outdated
vocab_info, initialize = None, False
if not conf.pretrained_model_path:
vocab_info, initialize = get_vocab_info(problem, emb_matrix), True
print(initialize)

@woailaosang

woailaosang May 16, 2019

Contributor

This 'print' line needs to be deleted, I think.
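
For reference, a minimal sketch of what this block looks like once the debug print is dropped; conf, problem, emb_matrix, and get_vocab_info are the names from the snippet above, and the comments describe the apparent intent rather than the author's wording.

vocab_info, initialize = None, False
if not conf.pretrained_model_path:
    # Training from scratch: build vocabulary info from the problem and the
    # embedding matrix, and ask the model to initialize its embedding layer.
    vocab_info, initialize = get_vocab_info(problem, emb_matrix), True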

train.py Outdated
# first time training, load problem from cache, and then backup the cache to model_save_dir/.necessary_cache/
if conf.use_cache and os.path.isfile(conf.problem_path):
def load(self, conf, problem, emb_matrix):
# load dictionary when (not finetune) and (cache invalid)

@woailaosang

woailaosang May 16, 2019

Contributor

load dictionary when (not finetune) and (cache valid)

@chengfx

chengfx May 18, 2019

Author Contributor

Thanks, done.
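
A minimal sketch of the load condition being settled here, assuming conf carries use_cache / finetune flags and a problem_path as in the snippet above; the exact attribute and function names are illustrative, not the PR's code.

import os

def should_load_cached_problem(conf, cache_valid):
    # Load the cached problem/dictionary only when this is not a finetune run
    # and the on-disk cache passed verification.
    return (conf.use_cache
            and os.path.isfile(conf.problem_path)
            and not conf.finetune
            and cache_valid)

When the condition holds, the problem is loaded from conf.problem_path and the cache is then backed up to model_save_dir/.necessary_cache/, as the diff comment describes.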

@chengfx chengfx requested a review from woailaosang May 20, 2019

@ljshou

Member

commented May 20, 2019

There are conflicts in these files:
problem.py
train.py

@chengfx

Contributor Author

commented May 22, 2019

There are conflicts in these files:
problem.py
train.py

Done.

@ljshou merged commit dc013c3 into master May 22, 2019

1 check passed

license/cla: All CLA requirements met.