Mini Program Size Optimization (1) -- Optimizing Large Text #13

Open
some-code opened this issue Jun 14, 2018 · 0 comments
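Background

Yesterday I took over a Mini Program and was asked to add some pages. Once the pages were written, the preview failed. Why? The package exceeded 2 MB. Mini Programs currently support subpackages, which raise the limit to 4 MB, but with future business growth in mind I decided to do a round of optimization first.

Removing unused data

To reduce the size, start with the big files. The first large file I found was address.js, a hefty 145 KB. Let's take a look at it.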

// The raw data looks roughly like this

module.exports = [{"code":"110000","region":"北京","regionEntitys":[{"code":"110100","region":"北京市","regionEntitys":[{"code":"110101","region":"东城区"},{"code":"110102","region":"西城区"},{"code":"110105","region":"朝阳区"},{"code":"110106","region":"丰台区"},{"code":"110107","region":"石景山区"},{"code":"110108","region":"海淀区"},{"code":"110109","region":"门头沟区"},{"code":"110111","region":"房山区"},{"code":"110112","region":"通州区"},{"code":"110113","region":"顺义区"},{"code":"110114","region":"昌平区"},{"code":"110115","region":"大兴区"},{"code":"110116","region":"怀柔区"},{"code":"110117","region":"平谷区"},{"code":"110118","region":"密云区"},{"code":"110119","region":"延庆区"},{"code":"110199","region":"其他区"}]}]},{"code":"120000","region":"天津","regionEntitys":[{"code":"120100","region":"天津市","regionEntitys":[{"code":"120101","region":"和平区"},{"code":"120102","region":"河东区"},{"code":"120103","region":"河西区"},{"code":"120104","region":"南开区"},{"code":"120105","region":"河北区"},{"code":"120106","region":"红桥区"},{"code":"120110","region":"东丽区"},{"code":"120111","region":"西青区"},{"code":"120112","region":"津南区"},{"code":"120113","region":"北辰区"},{"code":"120114","region":"武清区"},{"code":"120115","region":"宝坻区"},{"code":"120116","region":"滨海新区"},{"code":"120117","region":"宁河区"},{"code":"120118","region":"静海区"},{"code":"120119","region":"蓟州区"},{"code":"120199","region":"其他区"}]}]},{"code":"130000","region":"河北省","regionEntitys":[{"code":"130100","region":"石家庄市","regionEntitys":[{"code":"130102","region":"长安区"},{"code":"130104","region":"桥西区"},{"code":"130105","region":"新华区"},{"code":"130107","region":"井陉矿区"},{"code":"130108","region":"裕华区"},{"code":"130109","region":"藁城区"},{"code":"130110","region":"鹿泉区"},{"code":"130111","region":"栾城区"},{"code":"130121","region":"井陉县"},{"code":"130123","region":"正定县"},{"code":"130125","region":"行唐县"},{"code":"130126","region":"灵寿县"},{"code":"130127","region":"高邑县"},{"code":"130128","region":"深泽县"},{"code":"130129","region":"赞皇县"},{"code":"130130","region":"无极县"},{"code":"130131","region":"
平山县"},{"code":"130132","region":"元氏县"},{"code":"130133","region":"赵县"},{"code":"130183","region":"晋州市"},{"code":"130184","region":"新乐市"},{"code":"130199","region":"其他区"}]},{"code":"130200","region":"唐山市","regionEntitys":[{"code":"130202","region":"路南区"},{"code":"130203","region":"路北区"},{"code":"130204","region":"古冶区"},{"code":"130205","region":"开平区"},{"code":"130207","region":"丰南区"},{"code":"130208","region":"丰润区"},{"code":"130209","region":"曹妃甸区"},{"code":"130223","region":"滦县"},{"code":"130224","region":"滦南县"},{"code":"130225","region":"乐亭县"},{"code":"130227","region":"迁西县"},{"code":"130229","region":"玉田县"},{"code":"130281","region":"遵化市"},{"code":"130283","region":"迁安市"},{"code":"130299","region":"其他区"}]},{"code":"130300","region":"秦皇岛市","regionEntitys":[{"code":"130302","region":"海港区"},{"code":"130303","region":"山海关区"},{"code":"130304","region":"北戴河区"},{"code":"130306","region":"抚宁区"}, ...], ... ]

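Looking at the pages that use this file, the code field in the data is never read, so the first step is to replace it globally with an empty string.

// `data` is the array exported by address.js
let str = JSON.stringify(data)
// Remove the code field
str = str.replace(/"code":"\d{6}",/g, '')

After the replacement the size is 90 KB, a direct reduction of 38%.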
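
The regular expression above relies on every code value being exactly six digits and followed by a comma. As an alternative sketch (not the approach used in the original), the field can be dropped structurally with the replacer argument of JSON.stringify. The `sample` array and the `stripField` helper are illustrative stand-ins, not part of the original project.

```javascript
// Tiny stand-in for the real address data (the real file is about 145 KB).
const sample = [{
  code: '110000',
  region: '北京',
  regionEntitys: [{
    code: '110100',
    region: '北京市',
    regionEntitys: [{ code: '110101', region: '东城区' }]
  }]
}];

// Returning undefined from the replacer omits that property at every depth.
function stripField(data, fieldName) {
  return JSON.stringify(data, (key, value) => (key === fieldName ? undefined : value));
}

const before = JSON.stringify(sample);
const after = stripField(sample, 'code');

// Buffer.byteLength counts UTF-8 bytes, which is closer to the on-disk size
// than the string length when the text contains Chinese characters.
console.log(Buffer.byteLength(before), '->', Buffer.byteLength(after));
console.log(after);
```

This version does not care about the length of the code value or where the field sits within each object, at the cost of parsing and re-serializing the data.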

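Shortening key names

Removing the code field cut the size considerably, but there is room for more. How much can be saved by shortening the long key names?

// Rename regionEntitys to E
// Rename region to R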

str = str.replace(/regionEntitys/g, 'E')
str = str.replace(/region/g, 'R')

The size is now 68 KB. Renaming the keys alone cut another 24%.
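
Plain string replacement works here because the word region never appears inside a value. A more defensive sketch, assuming the same nested shape, renames the keys on the parsed object instead. `KEY_MAP` and `renameKeys` are illustrative names, not from the original.

```javascript
// Long key name -> short key name.
const KEY_MAP = new Map([
  ['regionEntitys', 'E'],
  ['region', 'R']
]);

// Recursively copy the structure, swapping any key found in KEY_MAP.
function renameKeys(node) {
  if (Array.isArray(node)) return node.map(renameKeys);
  if (node !== null && typeof node === 'object') {
    const out = {};
    for (const [key, value] of Object.entries(node)) {
      out[KEY_MAP.has(key) ? KEY_MAP.get(key) : key] = renameKeys(value);
    }
    return out;
  }
  return node; // strings, numbers, booleans, null are left untouched
}

const nested = [{
  region: '天津',
  regionEntitys: [{ region: '天津市', regionEntitys: [{ region: '和平区' }] }]
}];

const shortened = JSON.stringify(renameKeys(nested));
console.log(shortened);
```

Because only keys are touched, a value that happened to contain the text "region" would survive unchanged.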

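Data dictionary

Can it shrink further at this point? Of course. Just as key names can be shortened, the characters that recur across the Chinese strings can be pulled out into a data dictionary. First, count which characters appear most often: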

let hashMap = {};
for(let i = 0, len = str.length; i < len; i++){
  let char = str[i];
  // Skip JSON punctuation and the shortened key names
  if(['{', '}', '[', ']', ':', ',', '"', 'E', 'R'].indexOf(char) > -1) continue
  // Start from 0 so the first occurrence is counted once, not twice
  hashMap[char] = (hashMap[char] || 0) + 1
}

let sortList = [];
for(var i in hashMap){
  sortList.push([i, hashMap[i]])
}

// Top 20 entries of sortList, sorted by count in descending order
[ [ '县', 1503 ],
  [ '区', 1305 ],
  [ '市', 667 ],
  [ '其', 341 ],
  [ '他', 341 ],
  [ '族', 198 ],
  [ '山', 172 ],
  [ '治', 161 ],
  [ '自', 160 ],
  [ '城', 157 ],
  [ '州', 147 ],
  [ '阳', 132 ],
  [ '江', 125 ],
  [ '安', 120 ],
  [ '南', 109 ],
  [ '东', 85 ],
  [ '平', 82 ],
  [ '宁', 80 ],
  [ '河', 78 ],
  [ '西', 74 ] ]

With the counts in hand, do one global text replacement:

let top20 = sortList.sort((a,b)=>{return b[1] - a[1]}).slice(0, 20).map(i=>i[0])
let keyMap = 'abcdefghijklmnopqrstuvwxyz';

const pat = new RegExp(`(${top20.join('|')})`, 'g')
// Replace each high-frequency character with its placeholder letter
str = str.replace(pat, (hit)=>{
  let index = top20.indexOf(hit);
  return keyMap.charAt(index);
})
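
This substitution is only reversible if the placeholder letters never occur in the text before the replacement. That holds here because the remaining keys are the uppercase E and R and the values are Chinese, but a guard is cheap. A minimal sketch, with `findCollisions` as a hypothetical helper and two made-up input strings:

```javascript
// Returns the placeholder letters that already occur in the text.
// A non-empty result means decoding would corrupt those positions.
function findCollisions(text, placeholders) {
  return placeholders.split('').filter((letter) => text.includes(letter));
}

const placeholders = 'abcdefghijklmnopqrst'; // the 20 letters used for the top 20 characters
const safeText = '[{"R":"北京","E":[{"R":"北京市"}]}]';
const unsafeText = '[{"R":"北京","note":"cbd"}]';

const safeResult = findCollisions(safeText, placeholders);
const unsafeResult = findCollisions(unsafeText, placeholders);

console.log(safeResult);   // []
console.log(unsafeResult); // letters that would clash with the dictionary
```

If the check reports collisions, one option is to pick placeholders from characters that cannot appear in the data at all.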

The compressed text looks like this:

let region = [{"R":"北京","E":[{"R":"北京c","E":[{"R":"pjb"},{"R":"tjb"},{"R":"朝lb"},{"R":"丰台b"},{"R":"石景gb"},{"R":"海淀b"},{"R":"门头沟b"},{"R":"房gb"},{"R":"通kb"},{"R":"顺义b"},{"R":"昌qb"},{"R":"大兴b"},{"R":"怀柔b"},{"R":"q谷b"},{"R":"密云b"},{"R":"延庆b"},{"R":"deb"}]}]},{"R":"天津","E":[{"R":"天津c","E":[{"R":"和qb"},{"R":"spb"},{"R":"stb"},{"R":"o开b"},{"R":"s北b"},{"R":"红桥b"},{"R":"p丽b"}, ...]

After the replacement the text is 56 KB, another 17% smaller.
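
The size of this step can be sanity-checked against the frequency table. Assuming the file is stored as UTF-8, each of these Chinese characters takes 3 bytes and each replacement letter takes 1, so every substitution saves 2 bytes. The sketch below plugs in the 20 counts listed earlier; it is a rough estimate, not a measurement.

```javascript
// Counts of the top 20 characters, copied from the frequency table.
const counts = [1503, 1305, 667, 341, 341, 198, 172, 161, 160, 157,
                147, 132, 125, 120, 109, 85, 82, 80, 78, 74];

const bytesPerChinese = Buffer.byteLength('县', 'utf8'); // 3 bytes in UTF-8
const bytesPerLetter = Buffer.byteLength('a', 'utf8');   // 1 byte in UTF-8

const totalHits = counts.reduce((sum, n) => sum + n, 0);
const savedBytes = totalHits * (bytesPerChinese - bytesPerLetter);

console.log(totalHits, savedBytes, (savedBytes / 1024).toFixed(1) + ' KB');
// prints: 6037 12074 11.8 KB
```

An estimate of roughly 12 KB is in line with the reported drop from 68 KB to 56 KB.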

Parsing the text

After dictionary compression, the text has to be decoded before use, again with the help of the extracted dictionary:

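// Here top20 is assumed to be the dictionary object shipped with the data,
// mapping each placeholder letter back to its character, e.g. { a: '县', b: '区', ... }
let pat = new RegExp(`(${Object.keys(top20).join('|')})`, 'g')
function getRegion() {
  // Restore the original characters in the JSON text, then parse it.
  // JSON.parse throws if the restored text is malformed.
  return JSON.parse(JSON.stringify(region).replace(pat, (hit) => top20[hit]))
}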

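In testing, decoding takes about 5 ms, which is acceptable.

After all this effort, the file went from the original 145 KB down to 56 KB, a total reduction of 61%.

As the numbers show, compression helps, but the biggest win still came from removing the unused field. Following the same line of thinking, the next step is to optimize the images.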
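
The encode and decode steps can be checked against each other with a round trip. The sketch below uses a three-entry dictionary and a small sample instead of the real top 20 and the full data set; `dictionary`, `encode`, and `decode` are illustrative names rather than the original code.

```javascript
// Placeholder letter -> original character (a subset of the real dictionary).
const dictionary = { a: '县', b: '区', c: '市' };

// Build the reverse map (character -> letter) and both patterns once.
const reverse = {};
for (const [letter, char] of Object.entries(dictionary)) reverse[char] = letter;
const encodePattern = new RegExp(Object.keys(reverse).join('|'), 'g');
const decodePattern = new RegExp(Object.keys(dictionary).join('|'), 'g');

function encode(text) {
  return text.replace(encodePattern, (hit) => reverse[hit]);
}

function decode(text) {
  return text.replace(decodePattern, (hit) => dictionary[hit]);
}

const original = JSON.stringify([
  { R: '北京', E: [{ R: '北京市', E: [{ R: '东城区' }, { R: '密云区' }] }] }
]);
const packed = encode(original);
const restored = decode(packed);

console.log(packed);
console.log(restored === original); // true
```

Running a check like this over the full data set before shipping would catch any placeholder that clashes with existing text.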

Example

The demo for this article is here: Mini Program Size Optimization (1) -- Optimizing Large Files demo

@some-code some-code added the app label Jun 14, 2018