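# Note: In a recommender system, the key step is building the user-item rating matrix, that is, deriving a rating for each user-item pair from the users' preference information

## When building the rating matrix, choose an approach that fits the business, the available data, and the characteristics of that data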

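# Music File Recommendation

## 1. Requirements
1. Generate a daily list of 30 recommended songs for each user; the 30 songs must not include any song the user has listened to or browsed in the last 7 days
2. When a user browses a playlist page, recommend 5 other playlists that are similar to the current one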
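
The first requirement can be sketched as a filter followed by a top-N selection. This is a minimal sketch, not the project's actual implementation: the function name `daily_recommendations`, the shape of `scored_songs` (song id to predicted rating) and the shape of `history` (pairs of song id and event time) are all assumptions made for illustration.

```python
from datetime import datetime, timedelta


def daily_recommendations(scored_songs, history, now, n=30, window_days=7):
    """Return up to n song ids with the highest predicted rating,
    skipping songs the user played or browsed within the last window_days.

    scored_songs: dict of song_id -> predicted rating (assumed input shape)
    history: iterable of (song_id, datetime) play/browse events (assumed)
    """
    cutoff = now - timedelta(days=window_days)
    # Songs seen inside the exclusion window
    recent = {song_id for song_id, ts in history if ts >= cutoff}
    # Keep only songs outside the window, best rating first
    candidates = [(song_id, score) for song_id, score in scored_songs.items()
                  if song_id not in recent]
    candidates.sort(key=lambda pair: pair[1], reverse=True)
    return [song_id for song_id, _ in candidates[:n]]
```

Filtering before the top-N cut matters: cutting to 30 first and filtering afterwards could leave the list shorter than required.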

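## 2. Recommender System Workflow
1. Acquire the data
2. Build the raw user-item rating matrix
3. Use a recommendation algorithm to fill in the complete user-item rating matrix
4. Save the recommendation results to a database for other project teams to use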

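## 3. Data Acquisition

In practice, a recommender system is usually built for a company's internal use, so the data can be read directly from the company's own databases. Commonly used datasets include, but are not limited to: _**user behavior logs, item metadata, and user profile data**_<br/>
Note: as a shortcut, this project uses crawled playlist data as the raw data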

##### Playlist data format
```
{
  "result": {
    "coverImgUrl": "http://p1.music.126.net/Fo3s0Fx4rpgWIqx3ocx5TA==/1404076361831880.jpg",
    "ordered": true,
    "anonimous": false,
    "creator": {
      "followed": false,
      "remarkName": null,
      "expertTags": [
        "电子",
        "欧美"
      ],
      "userId": 42689902,
      "authority": 0,
      "userType": 0,
      "gender": 1,
      "backgroundImgId": 19075427230400572,
      "city": 210200,
      "mutual": false,
      "avatarUrl": "http://p1.music.126.net/0_h9kGGE6nPlmiLqY0Y5pw==/18567452860149155.jpg",
      "avatarImgIdStr": "18567452860149155",
      "detailDescription": "",
      "province": 210000,
      "description": "",
      "avatarImgId_str": "18567452860149155",
      "birthday": 799862400000,
      "nickname": "特洛伊-希文",
      "vipType": 0,
      "avatarImgId": 18567452860149156,
      "defaultAvatar": false,
      "djStatus": 10,
      "accountStatus": 0,
      "backgroundImgIdStr": "19075427230400573",
      "backgroundUrl": "http://p1.music.126.net/viGnRRajCOseYmHZ9vaePg==/19075427230400573.jpg",
      "signature": "即将毕业，经常不在线，偏爱清新电音和小众音乐，没有最爱，只听自己喜欢的",
      "authStatus": 0
    },
    "trackUpdateTime": 1491795359740,
    "userId": 42689902,
    "updateTime": 1469089663541,
    "commentCount": 621,
    "artists": null,
    "newImported": false,
    "commentThreadId": "A_PL_0_423245641",
    "subscribed": false,
    "privacy": 0,
    "id": 423245641,
    "trackCount": 30,
    "specialType": 0,
    "status": 0,
    "description": "Let's enjoy the sunshine afternoon with wonderful boys' voice！！！\n【让我们一起享受阳光午后的清新男声吧！！！】\n\n\n☀☀☀☀☀ 清新夏日第三单之【清新男声控】☀☀☀☀☀\n\n☀此单专注于清新的男声，这样的男声适合清新夏日的阳光午后。\n\n☀曲风为另类/独立的电子,也就是大家一起逛类似于H&M的潮店所听到的音乐。\n\n☀整张歌单选曲30首，其中轻快小调和舒缓小调穿插排序。\n\n☀希望此单带给大家清新夏日午后的男声愉悦听觉盛宴。\n\n☀让我们一起沉醉于男声，享受清新夏日的阳光午后吧！！！\n\n☀☀☀☀☀☀☀☀☀☀我的第四个歌单☀☀☀☀☀☀☀☀☀☀\n",
    "subscribedCount": 11355,
    "tags": [
      "欧美",
      "电子",
      "另类/独立"
    ],
    "coverImgId": 1404076361831880,
    "tracks": [],
    "highQuality": false,
    "subscribers": [],
    "playCount": 705665,
    "trackNumberUpdateTime": 1469089648033,
    "createTime": 1468670146885,
    "name": "☀清新夏日☀清新男声控|我想漂浮感受磁场",
    "cloudTrackCount": 0,
    "shareCount": 214,
    "adType": 0,
    "totalDuration": 0
  }
}
```

##### Song data contained in a playlist, i.e. the contents of the playlist's `tracks` field

```
{
  "bMusic": {
    "name": null,
    "extension": "mp3",
    "volumeDelta": -0.000265076,
    "sr": 44100,
    "dfsId": 1384285150446128,
    "playTime": 206146,
    "bitrate": 96000,
    "id": 1209309359,
    "size": 2473944
  },
  "hearTime": 0,
  "mvid": 0,
  "hMusic": {
    "name": null,
    "extension": "mp3",
    "volumeDelta": -0.000265076,
    "sr": 44100,
    "dfsId": 1384285150446125,
    "playTime": 206146,
    "bitrate": 320000,
    "id": 1209309356,
    "size": 8246377
  },
  "disc": "0",
  "artists": [
    {
      "img1v1Url": "http://p1.music.126.net/6y-UleORITEDbvrOLV0Q8A==/5639395138885805.jpg",
      "name": "Superwalkers",
      "briefDesc": "",
      "albumSize": 0,
      "img1v1Id": 0,
      "musicSize": 0,
      "alias": [],
      "picId": 0,
      "picUrl": "http://p1.music.126.net/6y-UleORITEDbvrOLV0Q8A==/5639395138885805.jpg",
      "trans": "",
      "id": 12048281
    }
  ],
  "duration": 206146,
  "id": 414691355,
  "album": {
    "status": 0,
    "blurPicUrl": "http://p1.music.126.net/F2xzSv77C_Xb3tof9ZmbjQ==/1380986614331927.jpg",
    "copyrightId": 0,
    "name": "Lost (As I Am)",
    "companyId": 0,
    "description": "",
    "pic": 1380986614331927,
    "commentThreadId": "R_AL_3_34700743",
    "publishTime": 1463673600007,
    "briefDesc": "",
    "company": "Cosmos Music",
    "picId": 1380986614331927,
    "alias": [],
    "picUrl": "http://p1.music.126.net/F2xzSv77C_Xb3tof9ZmbjQ==/1380986614331927.jpg",
    "artists": [
      {
        "img1v1Url": "http://p1.music.126.net/6y-UleORITEDbvrOLV0Q8A==/5639395138885805.jpg",
        "name": "Superwalkers",
        "briefDesc": "",
        "albumSize": 0,
        "img1v1Id": 0,
        "musicSize": 0,
        "alias": [],
        "picId": 0,
        "picUrl": "http://p1.music.126.net/6y-UleORITEDbvrOLV0Q8A==/5639395138885805.jpg",
        "trans": "",
        "id": 12048281
      }
    ],
    "songs": [],
    "artist": {
      "img1v1Url": "http://p1.music.126.net/6y-UleORITEDbvrOLV0Q8A==/5639395138885805.jpg",
      "name": "",
      "briefDesc": "",
      "albumSize": 0,
      "img1v1Id": 0,
      "musicSize": 0,
      "alias": [],
      "picId": 0,
      "picUrl": "http://p1.music.126.net/6y-UleORITEDbvrOLV0Q8A==/5639395138885805.jpg",
      "trans": "",
      "id": 0
    },
    "type": "EP/Single",
    "id": 34700743,
    "tags": "",
    "size": 0
  },
  "fee": 0,
  "no": 1,
  "rtUrl": null,
  "ringtone": null,
  "rtUrls": [],
  "score": 80,
  "rurl": null,
  "status": 0,
  "ftype": 0,
  "mp3Url": "http://m2.music.126.net/bHxh6kDAI_6FrdLrP6QewQ==/1384285150446128.mp3",
  "audition": null,
  "playedNum": 0,
  "commentThreadId": "R_SO_4_414691355",
  "mMusic": {
    "name": null,
    "extension": "mp3",
    "volumeDelta": -0.000265076,
    "sr": 44100,
    "dfsId": 1384285150446127,
    "playTime": 206146,
    "bitrate": 160000,
    "id": 1209309358,
    "size": 4123211
  },
  "lMusic": {
    "name": null,
    "extension": "mp3",
    "volumeDelta": -0.000265076,
    "sr": 44100,
    "dfsId": 1384285150446128,
    "playTime": 206146,
    "bitrate": 96000,
    "id": 1209309359,
    "size": 2473944
  },
  "copyrightId": 0,
  "name": "Lost (As I Am)",
  "rtype": 0,
  "crbt": null,
  "popularity": 80,
  "dayPlays": 0,
  "alias": [],
  "copyFrom": "",
  "position": 1,
  "starred": false,
  "starredNum": 0
}
```

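## 4. Building the Rating Matrix
1. Parse the raw data
2. Build the rating matrix

### Parsing the raw data
Working from the basic music metadata, assume that songs in the same playlist are fairly similar to one another and that the user who created the playlist likes all of them.<br/>
Extract six playlist fields: _**user id, playlist id, playlist name, last update time, subscription count, play count**_<br/>
Extract three fields per song: _**song id, song name, song popularity**_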

```
42689902##423245641##☀清新夏日☀清新男声控|我想漂浮感受磁场##1469089663541##11355##705665	414691355::::Lost (As I Am)::::80.0	410802620::::Next Escape::::100.0	419549837::::Silhouette::::60.0	419485281::::Feel My Love::::45.0	412016420::::Hit It::::65.0	421160284::::Catch U::::85.0	420513422::::Breathe It In::::30.0	420500507::::Tropical Suneo::::80.0	421203274::::New Age::::55.0	416933311::::Moments::::25.0	33340138::::Hideaway::::25.0	421423368::::Do It Right::::100.0	30870137::::Somebody Like You::::50.0	407838716::::When We Were Young::::25.0	417908273::::We Got U::::65.0	418316404::::pink skies::::100.0	414670117::::Waste Away::::65.0	418602540::::Happy (Extended Mix)::::85.0	420500511::::Lost::::80.0	412268350::::Easy Lover::::65.0	34899626::::Found Your Love::::100.0	416531764::::Bubblegum::::85.0	29483200::::Hold Me::::95.0	419238656::::Too Much::::40.0	37240741::::Fever::::100.0	38846209::::ILYSB::::90.0	36921820::::Sing::::45.0	418654949::::Straight On Till Morning::::25.0	421203736::::Get up Everybody! (Viva La Vida)::::45.0	418603101::::The Best Crew::::85.0

```
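
The flattened record above could be produced and read back roughly as follows. This is a sketch under assumptions: the field mapping (`userId`, `id`, `name`, `updateTime`, `subscribedCount`, `playCount` for the playlist; `id`, `name`, `popularity` for each track) is inferred by matching the sample record against the JSON samples, and the function names `flatten_playlist` and `parse_flat_record` are hypothetical.

```python
import json

PLAYLIST_KEYS = ("userId", "id", "name", "updateTime",
                 "subscribedCount", "playCount")


def flatten_playlist(raw_json):
    """Turn one crawled playlist JSON document into a single flat line:
    playlist fields joined by '##', then tab-separated songs whose
    fields are joined by '::::'."""
    result = json.loads(raw_json)["result"]
    header = "##".join(str(result[key]) for key in PLAYLIST_KEYS)
    songs = ["::::".join([str(t["id"]), t["name"], str(float(t["popularity"]))])
             for t in result["tracks"]]
    return "\t".join([header] + songs)


def parse_flat_record(line):
    """Parse a flat line back into a dict. Splitting from both ends keeps
    a playlist or song name intact even if it contains a separator
    (names are still assumed not to contain tab characters)."""
    header, *song_items = line.rstrip("\n").split("\t")
    user_id, playlist_id, rest = header.split("##", 2)
    name, update_time, subscribed, plays = rest.rsplit("##", 3)
    songs = []
    for item in song_items:
        song_id, rest = item.split("::::", 1)
        song_name, popularity = rest.rsplit("::::", 1)
        songs.append((int(song_id), song_name, float(popularity)))
    return {"user_id": int(user_id), "playlist_id": int(playlist_id),
            "name": name, "update_time": int(update_time),
            "subscribed_count": int(subscribed), "play_count": int(plays),
            "songs": songs}
```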

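### Building the rating matrix
_**Note:**_ in general, the rating matrix is built according to the business and the characteristics of the data.<br/>
Use each song's popularity as its base rating. If the playlist has more than 1,000 subscriptions and more than 10,000 plays, and was last modified within the past year, apply a weight of 1.1; otherwise apply a weight of 0.9. Then scale the final rating to the range [1, 10].<br/>
The result is a playlist id × song id rating matrix.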
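
The rating rule can be sketched as below. Several details are assumptions, since the text does not fix them: the weight is applied multiplicatively, popularity lies in [0, 100], "within one year" means 365 days before a caller-supplied `now`, update times are millisecond Unix timestamps, and the scaling is a linear map from [0, 110] to [1, 10]. All function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Largest possible weighted score, assuming popularity is in [0, 100]
MAX_WEIGHTED = 100 * 1.1


def playlist_weight(subscribed_count, play_count, update_time_ms, now):
    """Return 1.1 for popular, recently updated playlists, else 0.9."""
    updated = datetime.fromtimestamp(update_time_ms / 1000, tz=timezone.utc)
    recent = now - updated <= timedelta(days=365)
    if subscribed_count > 1000 and play_count > 10000 and recent:
        return 1.1
    return 0.9


def song_rating(popularity, weight):
    """Weight the popularity, then map [0, MAX_WEIGHTED] linearly to [1, 10]."""
    weighted = popularity * weight
    return 1.0 + 9.0 * weighted / MAX_WEIGHTED


def rating_triples(playlist_id, songs, weight):
    """Yield (playlist_id, song_id, rating) rows of the sparse rating matrix.

    songs: iterable of (song_id, popularity) pairs (assumed input shape)
    """
    for song_id, popularity in songs:
        yield playlist_id, song_id, song_rating(popularity, weight)
```

With a fixed input range, a rating of 10 is reachable only by a song with popularity 100 in a playlist weighted 1.1; songs in playlists weighted 0.9 top out below that. Min-max scaling over the observed scores would be an alternative if the full [1, 10] range should always be used.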