icklerly/assignment_11

# Technical Description

# Input Data

All input files are stored in HDFS, accessible at: hdfs://localhost:9000/users/icklerly/assignment11/Input

Matrix Completion Files:

- ./ALS/NaN_predict_Flink.csv
- ./ALS/NaN_train_Flink.csv

Machine Learning Files:

- ./ML/methylation_Flink.csv
- ./ML/mRNA_Flink.csv
- ./ML/mixed_Flink.csv
- ./ML/sparse_Flink.csv

Network File:

- ./mRNA_edges.csv
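
The relative paths listed above can be turned into full HDFS URIs by joining them onto the base directory. The following is a minimal sketch; `INPUT_BASE` and `input_uri` are hypothetical names used only for illustration, and the base is copied as written in this section (the example commands spell the directory `Assignment11`, so the constant may need adjusting to the actual HDFS layout).

```python
import posixpath

# Base directory as stated in this section. Assumption: adjust the
# spelling ("assignment11" vs "Assignment11") to match the real layout.
INPUT_BASE = "hdfs://localhost:9000/users/icklerly/assignment11/Input"


def input_uri(relative_path: str) -> str:
    """Join a listed relative path (e.g. './ML/sparse_Flink.csv') onto the base."""
    cleaned = posixpath.normpath(relative_path)  # drops the leading './'
    return f"{INPUT_BASE}/{cleaned}"


if __name__ == "__main__":
    for path in ("./ALS/NaN_train_Flink.csv", "./ML/sparse_Flink.csv"):
        print(input_uri(path))
```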

# Program

The JAR "Assignment11_icklerly.jar" contains all methods:

Matrix Completion:

-c MatrixCompletion [input path: train] [input path: predict] [output path/fileName]

    /home/flink.0.10_old/bin/flink run -c MatrixCompletion \
    /tmp/icklerly/try999.jar \
    hdfs://localhost:9000/users/icklerly/Assignment11/Input/ALS/NaN_train_Flink.csv \
    hdfs://localhost:9000/users/icklerly/Assignment11/Input/ALS/NaN_predict_Flink.csv \
    hdfs://localhost:9000/users/icklerly/Assignment11/Output/NaN_predicted.csv

Machine Learning:

-c ML [method: MLR, SVM] [data type: methylation, mRNA, mixed, sparse] [output path]

e.g.

    /home/flink.0.10_old/bin/flink run -c ML /tmp/icklerly/try999.jar \
    MLR \
    sparse \
    hdfs://localhost:9000/users/icklerly/Assignment11/Output

or

    /home/flink.0.10_old/bin/flink run -c ML /tmp/icklerly/try999.jar \
    SVM \
    sparse \
    hdfs://localhost:9000/users/icklerly/Assignment11/Output
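
Because the ML entry point takes a fixed set of method and data type values, it can help to check them before submitting a job. The sketch below assembles the argument list in the documented order and rejects unknown values. The function name `ml_command` is hypothetical; the Flink binary and JAR paths are taken from the examples above and are assumptions about the local setup.

```python
import shlex

# Paths copied from the example commands; assumptions about the local machine.
FLINK_BIN = "/home/flink.0.10_old/bin/flink"
JAR_PATH = "/tmp/icklerly/try999.jar"

# Allowed values as listed in the usage line.
METHODS = ("MLR", "SVM")
DATA_TYPES = ("methylation", "mRNA", "mixed", "sparse")


def ml_command(method: str, data_type: str, output_path: str) -> list:
    """Build the argument list: -c ML [method] [data type] [output path]."""
    if method not in METHODS:
        raise ValueError(f"unknown method {method!r}, expected one of {METHODS}")
    if data_type not in DATA_TYPES:
        raise ValueError(f"unknown data type {data_type!r}, expected one of {DATA_TYPES}")
    return [FLINK_BIN, "run", "-c", "ML", JAR_PATH, method, data_type, output_path]


if __name__ == "__main__":
    cmd = ml_command("SVM", "sparse",
                     "hdfs://localhost:9000/users/icklerly/Assignment11/Output")
    print(shlex.join(cmd))  # single-line form that can be pasted into a shell
```

The list form can be passed directly to `subprocess.run` without shell quoting concerns.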

Community Detection:

-c Communitydetection [edge path] [output path/fileName] [num iterations] [delta]

e.g.

    /home/flink.0.10_old/bin/flink run -c Communitydetection /tmp/icklerly/try999.jar \
    /users/icklerly/Assignment11/Input/Network/mRNA_edges.txt \
    /users/icklerly/Assignment11/Output \
    30 \
    0.5
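
The community detection entry point expects exactly four positional arguments in a fixed order, two paths followed by two numbers. As a rough sketch of how that order could be checked before a job is submitted, the hypothetical helper below (`parse_community_args` and `CommunityArgs` are illustrative names, not part of the JAR) converts the values to their expected types and rejects extra or misplaced arguments.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CommunityArgs:
    edge_path: str
    output_path: str
    num_iterations: int
    delta: float


def parse_community_args(args):
    """Check [edge path] [output path/fileName] [num iterations] [delta]."""
    if len(args) != 4:
        raise ValueError(f"expected 4 arguments, got {len(args)}")
    edge_path, output_path, iterations, delta = args
    # int() and float() raise ValueError if a path lands in a numeric slot.
    return CommunityArgs(edge_path, output_path, int(iterations), float(delta))


if __name__ == "__main__":
    parsed = parse_community_args([
        "/users/icklerly/Assignment11/Input/Network/mRNA_edges.txt",
        "/users/icklerly/Assignment11/Output",
        "30",
        "0.5",
    ])
    print(parsed)
```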

About

Assignment 11 - Big Data in Life Sciences
