*****************************************
*|  Owner: Rongxin (Coy) Zheng        |*
*|  Date: 2017/02/22                  |*
*|  Usage: projects display           |*
*|  Number of projects: 4             |*
*|  Email: coycooperzheng@gmail.com   |*
*|  Phone number: 571-363-6785        |*
*****************************************
This is a Git repository of Coy's curriculum projects. The projects below are listed in order of significance, most valuable first.
1. Graduate Research Assistant project, Spring semester 2017:
This project aims to build an online remote-access weblab in which students can run experiments from a laptop or smartphone. I am solely responsible for building the Core Server, Laboratory Server, and Experiment Server in Python; some of the APIs are written in JavaScript, but most of the source code is Python. This is my first formal job and my first involvement in a large real-world project. I learned a great deal from it: how to search for relevant articles, how to program in Python (I started learning Python in Spring 2017 specifically to finish this project), how to cooperate with teammates, and more.
2. The trickiest project I have ever done: predicting Powerball.
In Fall 2013, a client found me and hired me to design a mathematical model to predict the outcome of Powerball. After designing the model, I spent about two months programming it in C++. When I believed everything was done correctly, I presented it to the client. The results were relatively good, but the risk was too high. For example, over a three-year period (I used part of the data for training and the rest for testing), the rate of return was quite high, but the client was not willing to pay hundreds of thousands of dollars up front to take that risk. If we were unlucky and lost most of the money on the first try, it would be very hard to survive in such a cruel game. He therefore asked me to add a risk-avoidance module, even if it reduced revenue somewhat. However, shortly afterward I came to the United States and have not had much time to revise it.
Here I will only post part of my prediction data, without any model description or source code.
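The train/test evaluation mentioned above can be sketched in Python. This is a minimal illustration of backtesting a predictor on held-out draws, not the actual model: the function names, the naive frequency-based predictor, and the hit-counting score are all assumptions for demonstration.

```python
from collections import Counter

def backtest(draw_history, predict, split=0.7):
    """Fit on the first `split` fraction of past draws, then count how many
    predicted numbers appear in each held-out draw (hypothetical scoring)."""
    cut = int(len(draw_history) * split)
    train, test = draw_history[:cut], draw_history[cut:]
    hits = sum(len(set(predict(train)) & set(draw)) for draw in test)
    return hits / max(len(test), 1)   # average hits per held-out draw

def naive_predict(train, k=3):
    """Illustrative stand-in predictor: pick the k most frequent numbers."""
    counts = Counter(n for draw in train for n in draw)
    return [n for n, _ in counts.most_common(k)]

history = [(1, 2, 3), (2, 3, 4), (3, 4, 5), (4, 5, 6), (5, 6, 7)]
print(backtest(history, naive_predict, split=0.6))  # 0.5
```

The real model would replace `naive_predict`, and a risk-avoidance module would additionally weight the score by the cost of each bet.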
3. CS-504, Principles of Data Management and Mining, Fall 2016:
My final project for this course used Python to perform KNN and collaborative filtering (CF) analysis on a dataset from GroupLens, containing about 20 million ratings and 465,000 tag applications applied to 27,000 movies by 138,000 users. Unlike most other students, I did not simply import NumPy and SciPy: because the dataset is so large, calling the library functions directly without any constraints leads to very long running times.
To fix this, I added many constraints to my code and programmed every part myself. It took only about half an hour to read the data and build a similarity matrix with fully correct results, which was more efficient than what the other students, and even the instructor, achieved. This really impressed the instructor, and I received an A+. I improved not only the time complexity but also produced correct output.
This is a computer science course about machine learning. In this course I also learned how to use SQL in Oracle, NoSQL in MongoDB, and clustering in Weka.
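One way the "constrained" similarity matrix described above could be built is to store only item pairs that were actually co-rated, rather than a dense matrix. This is a hedged sketch of that idea using plain-Python item-item cosine similarity; the function and variable names are illustrative, not the project's actual code.

```python
from collections import defaultdict
from math import sqrt

def item_similarity(ratings):
    """Item-item cosine similarity from (user, item, rating) triples.

    Only pairs co-rated by at least one user are stored, which keeps the
    result sparse instead of allocating a dense items x items matrix.
    """
    by_user = defaultdict(dict)   # user -> {item: rating}
    norms = defaultdict(float)    # item -> sum of squared ratings
    for user, item, r in ratings:
        by_user[user][item] = r
        norms[item] += r * r

    dots = defaultdict(float)     # (item_a, item_b) -> dot product
    for items in by_user.values():
        pairs = sorted(items.items())
        for i, (a, ra) in enumerate(pairs):
            for b, rb in pairs[i + 1:]:
                dots[(a, b)] += ra * rb

    return {pair: d / (sqrt(norms[pair[0]]) * sqrt(norms[pair[1]]))
            for pair, d in dots.items()}

# Tiny example: two users rate two movies proportionally -> similarity 1.0
sims = item_similarity([("u1", "m1", 4), ("u1", "m2", 4),
                        ("u2", "m1", 5), ("u2", "m2", 5)])
print(round(sims[("m1", "m2")], 3))  # 1.0
```

On a 20-million-rating dataset, further constraints (e.g. pruning items with few co-ratings) would be needed to stay within the half-hour budget the text mentions.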
4. STAT-515, Applied Statistics & Visualization for Analytics, Fall 2015: (My part is pages 12 to 29 of the 29-page report.)
The final project for this course focuses on foreign trade data, analyzing various aspects of U.S. foreign trade over the past 15 years. Foreign Trade is the official source for U.S. export and import statistics and is responsible for issuing regulations governing the reporting of all export shipments from the United States (Census.gov).
Within this project, I was in charge of data visualization, the most difficult part of the course. The visualizations I built are an Albers equal-area conic projection map and a micromapST plot, which analyze the data both in aggregate and separately across the individual states.
Again, pages 12 to 29 of the report are my part.