As per [this answer](https://www.kaggle.com/c/msk-redefining-cancer-treatment/discussion/39782#224266) from the discussion forum, we can use the stage 1 test labels for training.  
I've concatenated all the former CSVs into a single DataFrame for my own use and thought it would be helpful for others as well.  

---
This notebook reads all the data into memory, joins the relevant files on their IDs, and outputs a stage 2 training DataFrame of shape `[3689, 4]` *(3321 training rows + 368 stage 1 test rows)* with columns `['Class', 'Gene', 'Text', 'Variation']`.  
  
*Since the released test IDs and the former training IDs conflict with each other, I've reset the index on the final training DataFrame to make them unique and cause less confusion. If you want to keep the index as is, you can skip the `reset_index` cell.*

In [None]:
import pandas as pd 

Read train_variants, train_text and join them.

In [None]:
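# Variant metadata with labels; the ID column is not loaded.
df_variants_train = pd.read_csv('../input/training_variants', usecols=['Gene', 'Variation', 'Class'])
# The text file separates ID and text with '||'. The raw-string regex avoids
# invalid-escape warnings, and skiprows=1 drops the header line.
df_text_train = pd.read_csv('../input/training_text', sep=r'\|\|', engine='python',
                            skiprows=1, names=['ID', 'Text'])
# Text is attached by row position; this assumes both files list the same rows in the same order.
df_variants_train['Text'] = df_text_train['Text']
df_train = df_variants_train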
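
The text files use `||` as the field separator, which is why a regex separator and the Python parser engine are needed. A minimal sketch on a hypothetical two-row in-memory sample (the rows and the `ID,Text` header line are assumptions for illustration, not copied from the data):

```python
import io
import pandas as pd

# Hypothetical miniature of a training_text-style file:
# one header line, then "ID||Text" rows. The text itself may contain commas.
sample = "ID,Text\n0||First abstract, with a comma.\n1||Second abstract.\n"

df_sample = pd.read_csv(io.StringIO(sample), sep=r'\|\|', engine='python',
                        skiprows=1, names=['ID', 'Text'])
print(df_sample)
```

Splitting on `||` rather than on commas keeps each abstract in a single `Text` field even when it contains commas.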

Read test_variants, test_text and join them.

In [None]:
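# Keep ID here: it is needed later to merge in the stage 1 labels.
df_variants_test = pd.read_csv('../input/test_variants', usecols=['ID', 'Gene', 'Variation'])
# Same '||'-separated layout as the training text file.
df_text_test = pd.read_csv('../input/test_text', sep=r'\|\|', engine='python',
                           skiprows=1, names=['ID', 'Text'])
# Text is attached by row position; this assumes both files list the same rows in the same order.
df_variants_test['Text'] = df_text_test['Text']
df_test = df_variants_test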

Read stage_1_solutions and join the class values with test files.

In [None]:
# read stage1 solutions
df_labels_test = pd.read_csv('../input/stage1_solution_filtered.csv')
df_labels_test['Class'] = pd.to_numeric(df_labels_test.drop('ID', axis=1).idxmax(axis=1).str[5:])

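# join with the test data on ID; the left merge leaves NaN (and a float dtype)
# for test rows without a released label
df_test = df_test.merge(df_labels_test[['ID', 'Class']], on='ID', how='left').drop('ID', axis=1)
# keep only the labelled rows and restore integer class labels
df_test = df_test[df_test['Class'].notnull()].astype({'Class': int})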

# join train and test files
df_stage_2_train = pd.concat([df_train, df_test])
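
The label extraction above can be illustrated on toy data. This sketch assumes the solution file stores one-hot columns named `class1` to `class9` next to `ID`; the two rows below are made up:

```python
import pandas as pd

# Hypothetical one-hot solution rows (IDs and labels invented for illustration).
cols = ['class%d' % i for i in range(1, 10)]
rows = [[12, 0, 0, 1, 0, 0, 0, 0, 0, 0],
        [19, 0, 0, 0, 0, 0, 0, 1, 0, 0]]
df_onehot = pd.DataFrame(rows, columns=['ID'] + cols)

# idxmax(axis=1) returns the name of the column holding the 1 in each row;
# slicing off the 5-character "class" prefix leaves the class number as text,
# which to_numeric converts to an integer.
labels = pd.to_numeric(df_onehot.drop('ID', axis=1).idxmax(axis=1).str[5:])
print(labels.tolist())
```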

Reset the training index to a simple range for readability (you can skip this).

In [None]:
df_stage_2_train.reset_index(drop=True, inplace=True)
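
Why the reset helps: `pd.concat` keeps each frame's original index, so the stacked frame has repeated labels. A small sketch with two made-up frames (the names `a`, `b` and the gene values are hypothetical):

```python
import pandas as pd

a = pd.DataFrame({'Gene': ['G1', 'G2']})  # index 0, 1
b = pd.DataFrame({'Gene': ['G3', 'G4']})  # index 0, 1 again

stacked = pd.concat([a, b])
print(stacked.index.tolist())   # labels repeat after stacking
print(stacked.index.is_unique)

# Dropping the old labels yields a fresh 0..n-1 range.
renumbered = stacked.reset_index(drop=True)
print(renumbered.index.tolist())
```

Passing `ignore_index=True` to `pd.concat` should give the same renumbered index in a single step.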

The resulting DataFrame can be used as the stage 2 training file, with shape `[3689, 4]`.

In [None]:
df_stage_2_train.info()

Alternatively, you can use `df_train` and `df_test` as your training and validation sets.  
The class distribution in `df_test` (the stage 1 results) looks roughly proportional to the overall distribution, i.e. it seems stratified.

In [None]:
pd.concat([df_test['Class'].value_counts(), df_stage_2_train['Class'].value_counts()], 
          axis=1, keys=['Test', 'All'])
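
Raw counts are hard to compare across sets of different sizes. One option is to compare class shares instead, sketched here on made-up label series (`test_cls` and `all_cls` are hypothetical stand-ins for `df_test['Class']` and `df_stage_2_train['Class']`):

```python
import pandas as pd

# Invented labels, only to show the mechanics.
test_cls = pd.Series([1, 1, 2, 4], name='Class')
all_cls = pd.Series([1, 1, 1, 2, 2, 4, 4, 4], name='Class')

# normalize=True turns counts into fractions of each set, so the two
# columns are directly comparable.
shares = pd.concat([test_cls.value_counts(normalize=True),
                    all_cls.value_counts(normalize=True)],
                   axis=1, keys=['Test', 'All']).sort_index()
print(shares)
```

If the split is close to stratified, the two columns should show similar fractions for each class.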