# A Short Walkthrough of the Social Trading Data


## Before you continue reading

This document is written as a `Jupyter` notebook, which contains both code and text. Best of all, you can run the code in the file, provided you have the data and have `Jupyter` installed.

If you only want to read this document, open the `html` file. If you want to execute the R code in the code blocks, you have to install `Jupyter` notebook.

I use the following packages heavily in my daily data processing and in this project. To run the code in this notebook, install them by executing the next code block line by line. It may print hundreds of lines of log output, which are safe to ignore unless one of them reports an `error`.

In [5]:
install.packages("devtools")
library(devtools)
install_github("xiaomowu/utilr")

library(utilr)

"unable to access index for repository http://www.stats.ox.ac.uk/pub/RWin/bin/windows/contrib/3.5:
"package 'devtools' is in use and will not be installed"
Skipping install of 'utilr' from a github remote, the SHA1 (2d3790ca) has not changed since last install.
  Use `force = TRUE` to force installation


You may have noticed that `utilr` is not an "official" package from `CRAN`. I wrote it myself; it bundles many other useful packages together with some of my personal utility functions.

You may also need to put the `.Rdata` files in an `Rdata` folder (create it if it does not exist) in the same directory as this `Jupyter` notebook, like this: ![image](img/directory.jpg)
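
As a small sketch of that setup step, the following base R snippet creates the `Rdata` folder next to the notebook if it is missing and lists the `.Rdata` files it finds there. It assumes the notebook's working directory is the notebook's own folder, which is the usual `Jupyter` default but not guaranteed.

```r
# Create the Rdata folder in the current working directory if it is missing.
# Assumption: the working directory is the folder that holds this notebook.
if (!dir.exists("Rdata")) {
  dir.create("Rdata")
}

# List the .Rdata files that are available for loading.
rdata.files <- list.files("Rdata", pattern = "\\.Rdata$", ignore.case = TRUE)
print(rdata.files)
```

An empty result means the data files still need to be copied into the folder.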

Now you're ready to go!

## Datasets: `user.stock`, `cube.info`, `cube.ret`, `cube.rb`

### `cube.info` (Portfolio information)

In [1]:
ld(cube.info)
cube.info[1:5]

' cube.info ' successfully loaded 
Use 1.82 secs 



cube.symbol,owner.id,market,create.date,close.date,fans.count
SP1000000,3340940262,cn,2016-07-28,,18
SP1000000,3340940262,cn,2016-08-04,2017-11-23,15
SP1000001,7163075843,cn,2016-08-04,2016-08-05,2
SP1000001,7163075843,cn,2016-08-04,2016-08-05,3
SP1000002,8851123831,cn,2016-06-27,,10


In [2]:
cube.info[, table(market)]

market
     cn      hk      us 
1658998   94891   80674 

- `cube.symbol`. The code (id) of each portfolio, used as its identifier. The website refers to portfolios as "cubes" and I follow this naming convention, although I don't know where it comes from. A `cube.symbol` starting with `SP` denotes a "real money" portfolio, and one starting with `ZH` denotes a "virtual" portfolio. Again, I'm just following their naming convention.


- `owner.id`. Identifier of the owner (creator) of the portfolio. Each `owner.id` can create only *one* real portfolio, but is allowed to create multiple virtual portfolios.


- `market`. The market the portfolio trades in: `cn` for China, `hk` for Hong Kong, and `us` for the US.


- `create.date`. When the portfolio was created.


- `close.date`. When it was closed. Some users choose to close their portfolios.


- `fans.count`. How many users are following (not necessarily copying) this portfolio.
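
To make the field descriptions concrete, here is a minimal sketch that rebuilds the five rows printed above as a toy `data.table` and derives two flags from them. The names `cube.info.demo`, `cube.type` and `is.open` are my own for illustration, and reading an empty `close.date` as "still open" is an assumption, not something the data states.

```r
library(data.table)

# Toy copy of the five rows printed above (an empty close.date is stored as NA).
cube.info.demo <- data.table(
  cube.symbol = c("SP1000000", "SP1000000", "SP1000001", "SP1000001", "SP1000002"),
  owner.id    = c(3340940262, 3340940262, 7163075843, 7163075843, 8851123831),
  market      = c("cn", "cn", "cn", "cn", "cn"),
  create.date = as.Date(c("2016-07-28", "2016-08-04", "2016-08-04", "2016-08-04", "2016-06-27")),
  close.date  = as.Date(c(NA, "2017-11-23", "2016-08-05", "2016-08-05", NA)),
  fans.count  = c(18L, 15L, 2L, 3L, 10L)
)

# Real-money vs. virtual portfolio, inferred from the symbol prefix.
cube.info.demo[, cube.type := fifelse(substr(cube.symbol, 1, 2) == "SP", "real", "virtual")]

# Assumption: a missing close.date means the portfolio is still open.
cube.info.demo[, is.open := is.na(close.date)]

# Count distinct portfolios per market and type. The same cube.symbol can
# appear on several rows, so count unique symbols rather than rows.
cube.info.demo[, .(n.cubes = uniqueN(cube.symbol)), by = .(market, cube.type)]
```

Counting with `uniqueN` rather than `.N` matters here because the printed sample already shows the same symbol on more than one row.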

### `cube.ret` (portfolio return)

In [4]:
ld(cube.ret)
cube.ret[1:5]

'cube.ret' already exists, will NOT load again!


Use 0 secs 



cube.symbol,date,value
SP1000000,2017-01-03,16.2671
SP1000000,2017-01-04,16.2671
SP1000000,2017-01-05,16.2671
SP1000000,2017-01-06,16.2671
SP1000000,2017-01-09,16.2671


- `cube.symbol`. See previous.


- `value`. The portfolio's net value. Every portfolio starts with a `value` of **1**.
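
Because every portfolio starts at a net value of 1, returns follow directly from `value`. The sketch below shows one way to compute daily and cumulative returns. The series is made up for illustration (the symbol `SP0000000` and its values are hypothetical), and `daily.ret` and `cum.ret` are names I chose.

```r
library(data.table)

# Hypothetical net-value series; the symbol and values are made up.
cube.ret.demo <- data.table(
  cube.symbol = rep("SP0000000", 4),
  date        = as.Date(c("2017-01-03", "2017-01-04", "2017-01-05", "2017-01-06")),
  value       = c(1.00, 1.10, 1.10, 0.99)
)

# Sort by portfolio and date so that shift() looks at the previous trading day.
setkey(cube.ret.demo, cube.symbol, date)

# Daily simple return: value_t / value_{t-1} - 1, computed within each portfolio.
# The first observation of each portfolio has no predecessor and stays NA.
cube.ret.demo[, daily.ret := value / shift(value) - 1, by = cube.symbol]

# Cumulative return since inception, valid because the starting value is 1.
cube.ret.demo[, cum.ret := value - 1]

print(cube.ret.demo)
```

Read this way, the value of 16.2671 shown for `SP1000000` above would correspond to a cumulative return of about 1527% since inception.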

### `cube.rb` (portfolio rebalance)

In [6]:
ld(cube.rb)
head(cube.rb)

' cube.rb ' successfully loaded 
Use 33.46 secs 



cube.symbol,created.at,stock.symbol,price,prev.weight.adjusted,target.weight
SP1000047,2017-03-20 09:25:00,SH600886,7.49,0.9,0.7
SP1000064,2017-03-20 13:00:03,SH600315,29.43,7.57,3.78
SP1000064,2017-03-20 13:00:03,SH600315,29.44,3.78,0.0
SP1000064,2017-09-12 10:06:10,SH600479,13.57,9.56,11.81
SP1000064,2017-09-12 10:06:13,SH600479,13.57,11.81,12.37
SP1000064,2017-03-20 09:46:48,SH600664,8.3,0.0,0.43


- `created.at`. When the trade was made.


- `stock.symbol`. The stock involved in the trade. The `SH` prefix is for the Shanghai Exchange, and `SZ` is for the Shenzhen Exchange.


- `price`. The price at which the trade was made.


- `prev.weight.adjusted` and `target.weight`. The stock's weight in the portfolio before and after the trade. For example, if these two variables are `0.9` and `0.7` (first line above), respectively, it means that before the trade this stock (SH600886) accounted for 0.9% of the total portfolio, while it accounts for only 0.7% after the trade. The other rows, with values such as `7.57` and `11.81`, show that the weights are expressed in percent. Since the weight falls, this is a *SELL* trade.
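
The buy/sell reading of the two weight columns can be sketched as follows, using the six rows printed above. `cube.rb.demo`, `weight.change`, `direction` and `exchange` are illustrative names of mine, and parsing `created.at` as UTC is an assumption, since the source does not state a time zone.

```r
library(data.table)

# Toy copy of the six rows printed above.
cube.rb.demo <- data.table(
  cube.symbol  = c("SP1000047", rep("SP1000064", 5)),
  created.at   = as.POSIXct(
    c("2017-03-20 09:25:00", "2017-03-20 13:00:03", "2017-03-20 13:00:03",
      "2017-09-12 10:06:10", "2017-09-12 10:06:13", "2017-03-20 09:46:48"),
    tz = "UTC"  # assumption: the real time zone is not documented
  ),
  stock.symbol = c("SH600886", "SH600315", "SH600315", "SH600479", "SH600479", "SH600664"),
  price        = c(7.49, 29.43, 29.44, 13.57, 13.57, 8.3),
  prev.weight.adjusted = c(0.9, 7.57, 3.78, 9.56, 11.81, 0.0),
  target.weight        = c(0.7, 3.78, 0.0, 11.81, 12.37, 0.43)
)

# A trade that raises the stock's weight is a buy; one that lowers it is a sell.
cube.rb.demo[, weight.change := target.weight - prev.weight.adjusted]
cube.rb.demo[, direction := fifelse(weight.change > 0, "BUY",
                            fifelse(weight.change < 0, "SELL", "NONE"))]

# Exchange prefix of the stock symbol ("SH" or "SZ" for the rows described above).
cube.rb.demo[, exchange := substr(stock.symbol, 1, 2)]

cube.rb.demo[, .(cube.symbol, stock.symbol, weight.change, direction, exchange)]
```

A `target.weight` of 0 (third row) means the position was closed entirely, and a `prev.weight.adjusted` of 0 (last row) means a new position was opened.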

### `user.stock` (user's stock watchlist)

This dataset records when a stock (or portfolio) was added to a user's watchlist.

In [9]:
ld(user.stock)
user.stock[1:5]

' user.stock ' successfully loaded 
Use 26.06 secs 



user.id,create.at,code,exchange
1000005225,2018-01-27 09:44:34,ZH030505,ZHHK
1000005225,2018-01-27 09:42:18,ZH1262123,ZHHK
1000005331,1970-01-01 00:00:00,.DJI,INDEXDJX
1000005331,2018-04-11 02:57:16,.DJI,INDEXDJX
1000005331,1970-01-01 00:00:00,.IXIC,INDEXNASDAQ


- `user.id`. Identifier of the user who owns the watchlist.


- `create.at`. When this stock was added to the watchlist.


- `code`. Code of the stock or portfolio. Please note that (1) users can add stocks *as well as portfolios* to the watchlist, and (2) `code` and `stock.symbol` are the same variable under different names.


- `exchange`. The exchange the stock or portfolio belongs to. For example, `ZHHK` means this is a virtual portfolio (`ZH`) that trades in the Hong Kong market.
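
Since the watchlist mixes stocks and portfolios, a first step is often to separate the two. The sketch below does this on the five rows printed above with a simple heuristic: a `code` made of `ZH` or `SP` followed only by digits is treated as a portfolio. That rule, the names `is.cube` and `has.timestamp`, and the reading of `1970-01-01 00:00:00` (the Unix epoch) as a missing timestamp are all assumptions on my side.

```r
library(data.table)

# Toy copy of the five rows printed above; UTC is an assumed time zone.
user.stock.demo <- data.table(
  user.id   = c(1000005225, 1000005225, 1000005331, 1000005331, 1000005331),
  create.at = as.POSIXct(
    c("2018-01-27 09:44:34", "2018-01-27 09:42:18", "1970-01-01 00:00:00",
      "2018-04-11 02:57:16", "1970-01-01 00:00:00"),
    tz = "UTC"
  ),
  code      = c("ZH030505", "ZH1262123", ".DJI", ".DJI", ".IXIC"),
  exchange  = c("ZHHK", "ZHHK", "INDEXDJX", "INDEXDJX", "INDEXNASDAQ")
)

# Heuristic: portfolio codes are "ZH" or "SP" followed only by digits.
user.stock.demo[, is.cube := grepl("^(ZH|SP)[0-9]+$", code)]

# Assumption: the Unix epoch is a placeholder for an unknown timestamp.
epoch <- as.POSIXct("1970-01-01 00:00:00", tz = "UTC")
user.stock.demo[, has.timestamp := create.at != epoch]

# Watchlist entries per user, split into portfolios and other instruments.
user.stock.demo[, .(n.entries = .N), by = .(user.id, is.cube)]
```

Requiring digits after the prefix keeps ordinary tickers that merely begin with `SP` from being counted as portfolios.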

Let me know if you have any questions. Happy learning!

Ross