Skip to content
This repository has been archived by the owner on Nov 26, 2022. It is now read-only.

用户数据过多peewee无法插入 #158

Closed
ThereChave opened this issue Mar 28, 2021 · 29 comments
Closed

用户数据过多peewee无法插入 #158

ThereChave opened this issue Mar 28, 2021 · 29 comments

Comments

@ThereChave
Copy link

生成了一批用户,好像大于999提示sqlite3.OperationalError: too many SQL variables

[ERROR]2021-03-28 02:10:09,142 default_exception_handler line:1703 Task exception was never retrieved
future: <Task finished name='Task-1' coro=<ProxyMan.start_and_check_ss_server() done, defined at /root/aioshadowsocks/shadowsocks/proxyman.py:95> exception=OperationalError('too many SQL variables')>
Traceback (most recent call last):
  File "/usr/local/python3/lib/python3.8/site-packages/peewee.py", line 3144, in execute_sql
    cursor.execute(sql, params or ())
sqlite3.OperationalError: too many SQL variables
@Ehco1996
Copy link
Owner

@ThereChave 我在本地造了1000+的用户,并没有出现你说的问题,

可以贴上api拉出的user列表么?

@ThereChave
Copy link
Author

ThereChave commented Mar 28, 2021

查到原因了,是因为sqlite3版本问题,http://docs.peewee-orm.com/en/latest/peewee/querying.html#bulk-inserts

SQLite users should be aware of some caveats when using bulk inserts. Specifically, your SQLite3 version must be 3.7.11.0 or newer to take advantage of the bulk insert API. Additionally, by default SQLite limits the number of bound variables in a SQL query to 999 for SQLite versions prior to 3.32.0 (2020-05-22) and 32766 for SQLite versions after 3.32.0.

服务器默认的sqlite3基本为3.7.*版本,尝试编译到3.32以上正常

@ThereChave
Copy link
Author

1000+以上服务器器CPU负载就直接满了是吧

@Ehco1996
Copy link
Owner

@ThereChave 要看是不是1000人同时在线了

具体可能得上机器看一下我才能知道,也没有条件做这方面的本地测试

@Ehco1996 Ehco1996 reopened this Mar 30, 2021
@ThereChave
Copy link
Author

@Ehco1996 好像就是刚开始加载的时候,启动的时候CPU到100%,启动后没有到CPU100了

@Ehco1996
Copy link
Owner

@ThereChave 那还挺正常的,一开始要批量写数据和算缓存

@ThereChave
Copy link
Author

哦哦,生成了1500个用户,观察了下,EC2服务器,启动服务后CPU直接到100,然后隔一段时间,应该是sync_time后再次到100,哪里无法优化的吗?

@Ehco1996
Copy link
Owner

想想办法还是可以优化的...
你用的是1000个人用一个端口么?

@ThereChave
Copy link
Author

是的

@Ehco1996
Copy link
Owner

最简单的办法是做一下sharding,100个人左右共用端口是最好的

@ThereChave
Copy link
Author

没有其他优化了吗?ssr测试同样的单端口CPU占用并不是很高?难道是因为密码来区分用户的原因吗?

@ThereChave
Copy link
Author

听说是sqlite3插入数据库的时候占用资源很高,不懂大佬为什么要用数据库来存?

@Ehco1996
Copy link
Owner

没有其他优化了吗?ssr测试同样的单端口CPU占用并不是很高?难道是因为密码来区分用户的原因吗?

的确是这个原因,ss的单端口和ssr的原理就不太一样 这种算力节省不了的

@ThereChave
Copy link
Author

好的

@ThereChave
Copy link
Author

感觉直接用列表存用户数据能好点吧?cpu大部分都在写数据库,而且model的那个not_in非常费cpu

@Ehco1996
Copy link
Owner

@ThereChave 你是怎么定位到not_in这个部分耗费cpu的?有火焰图之类的么?

@ThereChave
Copy link
Author

查看分析的

    def create_or_update_by_user_data_list(cls, user_data_list):
        user_ids = []
        for user_data in user_data_list:
            user_ids.append(user_data["user_id"])
            cls._create_or_update_user_from_data(user_data)
        cnt = cls.delete().where(cls.user_id.not_in(user_ids)).execute()
        if cnt:
            logging.info(f"delete out of traffic user cnt: {cnt}")

这个函数占用CPU最多,最多的在插入数据,主要一个一个插入,然后这个查询

@ThereChave
Copy link
Author

之前说的要升级sqlite3的主要是这个影响的, cnt = cls.delete().where(cls.user_id.not_in(user_ids)).execute()。user_id很容易就超过999,然后就超了

@Ehco1996
Copy link
Owner

Ehco1996 commented Apr 1, 2021

截屏2021-04-01 上午11 03 38

@ThereChave 我抓了一下火焰图..的确有这个问题,得想想办法

@ThereChave
Copy link
Author

ThereChave commented Apr 1, 2021

刚才我没生成出来火炬图,应该就是这个问题,CPU直接冲上100%,其他就卡住了

@Ehco1996
Copy link
Owner

Ehco1996 commented Apr 1, 2021

这里一次sync太多人的确会有问题,需要优化一下

Ehco1996 added a commit that referenced this issue Apr 9, 2021
Ehco1996 added a commit that referenced this issue Apr 9, 2021
Ehco1996 added a commit that referenced this issue Apr 9, 2021
Ehco1996 added a commit that referenced this issue Apr 9, 2021
@Ehco1996
Copy link
Owner

Ehco1996 commented Apr 9, 2021

@ThereChave 最新版大幅度优化了cpu的使用,可以尝试一下

@ThereChave
Copy link
Author

生成2900个用户CPU占用14左右。比上个版本好的非常多,但是3000个就会提示sync error: too many SQL variables,sqlite3已经升级到
sqlite3.sqlite_version
'3.35.3'
任然存在这个问题

@Ehco1996
Copy link
Owner

Ehco1996 commented Apr 9, 2021

@ThereChave 我在本地尝试了一下5000个用户,并没有报错,然后我得sqlite的版本是3.32.3

看上去和sqlite的版本没有关系?能贴一下你报错的时候的截图么

@ThereChave
Copy link
Author

info
debug

@ThereChave
Copy link
Author

卡到这里了,2900可以,3000不行

@Ehco1996
Copy link
Owner

@ThereChave 我尝试限制了bulk_create 的数量,可以再试一下?

另外感觉sqlite在不同系统数的最大限制不一致,我在本地mac上可以一口气批量创建5000+的用户

@Ehco1996 Ehco1996 reopened this Apr 10, 2021
@ThereChave
Copy link
Author

现在是似乎没有问题了

@Ehco1996
Copy link
Owner

那先close掉了,有问题欢迎再开

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants