Huge .browse.VC.db file #150

Closed
alexdima opened this issue Aug 16, 2016 · 14 comments
Labels
Language Service, more info needed (The issue report is not actionable in its current state)

Comments

@alexdima
Member

Moved from microsoft/vscode#10557


From @bharathitman

  • VSCode Version: 1.4.0
  • OS Version: Windows 10

Steps to Reproduce:

  1. I was working on a simple Angular 2 application for a few hours. When I was about to push the code, I got an error saying there was a huge file, .browse.VC.db (~720 MB in size). I had to add it to .gitignore.
  2. I think I understand the purpose of the file, but is it supposed to be this large, or is this behavior unexpected?

From @AkashGutha

I had the same problem, although my file was only 40 MB.
What is this file for?

@jgoshi
Member

jgoshi commented Aug 23, 2016

[Attached screenshots: image2, image1]

This file is the database of symbols in your files. Depending on the size of the project the database can become large. The file sizes you listed look normal. You can exclude it from git as you did. You can also control the location of the database (so it is outside your repo). See the two attached screenshots for more details. If you edit the cpp settings file you can find a databaseFilename setting. Use the full path (directory and file name) you want to use. If it's an empty string (or missing from the settings file) then it'll go to the default location.

@jgoshi closed this as completed Aug 23, 2016
@tojocky

tojocky commented May 5, 2017

mine is 20+ GB.

@sean-mcmanus
Collaborator

@tojocky Wow, that seems too big. The largest I've seen is 1.4 GB for Chromium. Does changing some of the settings to reduce the size work for you? You can use files.exclude to remove directories and files that you don't need symbols for. Setting limitSymbolsToIncludedHeaders to true might help too, as should setting addWorkspaceRootToIncludePath to false and then selectively adding only the directories you actually want symbols for. You should also delete the database or change the databaseFilename after making these settings changes, because the database doesn't self-clean and can accumulate junk from older settings (which we've been planning to fix for a while). This could also be a new bug, or it could be due to symbolic link cycles, but we would need more info to tell.
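
To check the symbolic link possibility mentioned above, a minimal sketch along the following lines could be run against the workspace root. The function name `find_symlink_cycles` is hypothetical, and whether the extension actually follows such links while scanning is an assumption that the thread does not confirm.

```python
import os


def find_symlink_cycles(root):
    """Return symlinked directories under root that point at one of their own ancestors."""
    cycles = []
    # followlinks=False: symlinked directories are listed but not descended into.
    for dirpath, dirnames, _ in os.walk(root, followlinks=False):
        real_parent = os.path.realpath(dirpath)
        for name in dirnames:
            link = os.path.join(dirpath, name)
            if not os.path.islink(link):
                continue
            target = os.path.realpath(link)
            try:
                # The link loops back if its target contains the link's own directory.
                is_ancestor = os.path.commonpath([target, real_parent]) == target
            except ValueError:  # e.g. paths on different drives on Windows
                is_ancestor = False
            if is_ancestor:
                cycles.append(link)
    return cycles
```

Any path it returns is a candidate for files.exclude.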

@sean-mcmanus added the Language Service and more info needed (The issue report is not actionable in its current state) labels May 5, 2017
@tojocky

tojocky commented Aug 14, 2017

OK, I'm back to ~20GB

@sean-mcmanus
Collaborator

sean-mcmanus commented Aug 14, 2017

@tojocky Can you provide more info? Do you think this is a bug? You should be able to work around the issue by deleting the database file (or changing databaseFilename) after reducing the scope of the browse.path setting so that it doesn't include so many files. Our database adds all the filenames it recursively detects from browse.path and then parses the files it believes are C/C++ for symbol information. So it's finding too many files, parsing too many files, or both. You could help us diagnose the issue by opening the .browse.vc.db file with a SQLite viewer and looking for what's causing the size bloat. The database also doesn't remove files that no longer exist in the browse.path, so it requires a manual deletion to clean up (an issue we are planning to fix in September).
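
If no SQLite viewer is at hand, the same first look can be taken from Python's standard sqlite3 module. This is only a sketch: `table_row_counts` is a hypothetical helper, it assumes the file is an ordinary SQLite database that is not in use at the time, and row counts are only a rough proxy for on-disk size.

```python
import sqlite3


def table_row_counts(db_path):
    """Return (table_name, row_count) pairs for every table, largest first."""
    conn = sqlite3.connect(db_path)
    try:
        names = [row[0] for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'")]
        counts = []
        for name in names:
            # Quote the identifier so unusual table names cannot break the query.
            quoted = '"' + name.replace('"', '""') + '"'
            (n,) = conn.execute(f"SELECT COUNT(*) FROM {quoted}").fetchone()
            counts.append((name, n))
    finally:
        conn.close()
    return sorted(counts, key=lambda pair: pair[1], reverse=True)
```

On a multi-gigabyte database each COUNT(*) may take a while, so it is probably safer to run this on a copy with VS Code closed.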

@tojocky

tojocky commented Aug 15, 2017

This time I used sqlite3_analyzer.exe to understand what is going on.
It seems the CODE_ITEMS table with its indexes takes most of the space.

I ran the SQL command: "select count(*) from code_items;" and the result is: 110702994

/** Disk-Space Utilization Report For C:\Users\ion.lupascu\AppData\Roaming\Code\User\workspaceStorage\dc5891a1df997736f6106d3d0a76af58\ms-vscode.cpptools\.BROWSE.VC.DB

Page size in bytes................................ 4096      
Pages in the whole file (measured)................ 5213044   
Pages in the whole file (calculated).............. 5213043   
Pages that store data............................. 5213042    100.000% 
Pages on the freelist (per header)................ 1            0.0% 
Pages on the freelist (calculated)................ 2            0.0% 
Pages of auto-vacuum overhead..................... 0            0.0% 
Number of tables in the database.................. 15        
Number of indices................................. 37        
Number of defined indices......................... 30        
Number of implied indices......................... 7         
Size of the file in bytes......................... 21352628224
Bytes of user payload stored...................... 8825666906  41.3% 

*** Page counts for all tables with their indices *****************************

CODE_ITEMS........................................ 4808119     92.2% 
FILE_SIGNATURES................................... 163027       3.1% 
FILES............................................. 102832       2.0% 
ASSOC_TEXT........................................ 62247        1.2% 
ASSOC_SPANS....................................... 58479        1.1% 
BASE_CLASS_PARENTS................................ 18307        0.35% 
CONFIGS........................................... 5            0.0% 
FILE_MAP.......................................... 5            0.0% 
CONFIG_FILES...................................... 4            0.0% 
PROJECTS.......................................... 4            0.0% 
SQLITE_MASTER..................................... 4            0.0% 
SHARED_TEXT....................................... 3            0.0% 
CODE_ITEM_KINDS................................... 2            0.0% 
PARSERS........................................... 2            0.0% 
PROPERTIES........................................ 2            0.0% 

*** Page counts for all tables and indices separately *************************

CODE_ITEMS........................................ 2146534     41.2% 
IX_CODE_ITEMS_NAME................................ 558092      10.7% 
IX_CODE_ITEMS_PARENT_ID_KIND...................... 478439       9.2% 
SQLITE_AUTOINDEX_CODE_ITEMS_1..................... 428285       8.2% 
IX_CODE_ITEMS_PARENT_ID........................... 416587       8.0% 
IX_CODE_ITEMS_LOWER_NAME_HINT..................... 390802       7.5% 
IX_CODE_ITEMS_FILE_ID............................. 389380       7.5% 
FILE_SIGNATURES................................... 158169       3.0% 
FILES............................................. 53016        1.0% 
ASSOC_TEXT........................................ 44283        0.85% 
UQ_FILES_NAME..................................... 37322        0.72% 
ASSOC_SPANS....................................... 25118        0.48% 
UQ_ASSOC_SPANS_CODE_ITEM_ID_KIND.................. 17383        0.33% 
IX_ASSOC_SPANS_CODE_ITEM_ID....................... 15978        0.31% 
UQ_ASSOC_TEXT_CODE_ITEM_ID_KIND................... 9400         0.18% 
IX_FILES_LEAF_NAME................................ 8851         0.17% 
IX_ASSOC_TEXT_CODE_ITEM_ID........................ 8564         0.16% 
UQ_BASE_CLASS_PARENTS_BASE_CODE_ITEM_ID_PARENT_CODE_ITEM_ID 5577         0.11% 
BASE_CLASS_PARENTS................................ 4642         0.089% 
IX_BASE_CLASS_PARENTS_BASE_CODE_ITEM_ID........... 4044         0.078% 
IX_BASE_CLASS_PARENTS_PARENT_CODE_ITEM_ID......... 4044         0.078% 
SQLITE_AUTOINDEX_FILES_1.......................... 3643         0.070% 
UQ_FILE_SIGNATURES_FILE_ID_KIND................... 2533         0.049% 
IX_FILE_SIGNATURES_FILE_ID........................ 2325         0.045% 
SQLITE_MASTER..................................... 4            0.0% 
CODE_ITEM_KINDS................................... 1            0.0% 
CONFIG_FILES...................................... 1            0.0% 
CONFIGS........................................... 1            0.0% 
FILE_MAP.......................................... 1            0.0% 
IX_CONFIG_FILES_CONFIG_ID......................... 1            0.0% 
IX_CONFIG_FILES_FILE_ID........................... 1            0.0% 
IX_CONFIGS_NAME................................... 1            0.0% 
IX_CONFIGS_PROJECT_ID............................. 1            0.0% 
IX_FILE_MAP_CODE_ITEM_ID.......................... 1            0.0% 
IX_FILE_MAP_CONFIG_ID............................. 1            0.0% 
IX_FILE_MAP_FILE_ID............................... 1            0.0% 
IX_SHARED_TEXT_HASH............................... 1            0.0% 
PARSERS........................................... 1            0.0% 
PROJECTS.......................................... 1            0.0% 
PROPERTIES........................................ 1            0.0% 
SHARED_TEXT....................................... 1            0.0% 
SQLITE_AUTOINDEX_CONFIGS_1........................ 1            0.0% 
SQLITE_AUTOINDEX_PARSERS_1........................ 1            0.0% 
SQLITE_AUTOINDEX_PROJECTS_1....................... 1            0.0% 
SQLITE_AUTOINDEX_PROPERTIES_1..................... 1            0.0% 
SQLITE_AUTOINDEX_SHARED_TEXT_1.................... 1            0.0% 
UQ_CODE_ITEM_KINDS_NAME_PARSER_GUID............... 1            0.0% 
UQ_CONFIG_FILES_CONFIG_ID_FILE_ID................. 1            0.0% 
UQ_CONFIGS_PROJECT_ID_NAME........................ 1            0.0% 
UQ_FILE_MAP_CODE_ITEM_ID_CONFIG_ID_FILE_ID........ 1            0.0% 
UQ_PROJECTS_GUID.................................. 1            0.0% 
UQ_PROJECTS_NAME.................................. 1            0.0% 
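
The figures in the report can be cross-checked with a little arithmetic. The sketch below uses only numbers quoted in this comment; the variable names are made up for illustration.

```python
PAGE_SIZE = 4096            # "Page size in bytes"
TOTAL_PAGES = 5213044       # "Pages in the whole file (measured)"
CODE_ITEMS_ALL = 4808119    # CODE_ITEMS together with its indices
CODE_ITEMS_TABLE = 2146534  # CODE_ITEMS table alone
ROW_COUNT = 110702994       # result of "select count(*) from code_items;"

file_bytes = PAGE_SIZE * TOTAL_PAGES
share = CODE_ITEMS_ALL / TOTAL_PAGES
index_pages = CODE_ITEMS_ALL - CODE_ITEMS_TABLE
bytes_per_row = CODE_ITEMS_TABLE * PAGE_SIZE / ROW_COUNT

print(file_bytes)            # 21352628224, the reported file size
print(f"{share:.1%}")        # 92.2%
print(index_pages)           # 2661585 pages used by the CODE_ITEMS indices
print(round(bytes_per_row))  # 79 bytes of table pages per row
```

So the indices on CODE_ITEMS occupy more pages than the table itself, and each row costs roughly 79 bytes of table storage before indexing.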

I also ran the command "select * from code_items limit 50;" and I see things like:

"27"	"1"	"0"	"35"	"65538"	"iomanip"	""	"1"	"31"	"18"	"31"	"0"	"0"	"0"	"0"	""	"NULL"	"NULL"	"NULL"	"NULL"	"NULL"	"ioma"
"28"	"1"	"0"	"35"	"65538"	"math.h"	""	"1"	"32"	"17"	"32"	"0"	"0"	"0"	"0"	""	"NULL"	"NULL"	"NULL"	"NULL"	"NULL"	"math"
"29"	"1"	"0"	"35"	"65538"	"algorithm"	""	"1"	"33"	"20"	"33"	"0"	"0"	"0"	"0"	""	"NULL"	"NULL"	"NULL"	"NULL"	"NULL"	"algo"

except my hpp files.

I also checked how many times a file is repeated by running "select count(*) from code_items where name="iomanip";". The result is 2098.

Question: is this the # of lines?

Let me know if you need more info.
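
The single-name count above can be generalized to list the most frequently repeated names. A minimal sketch, assuming only what the queries in this comment show (a `code_items` table with a `name` column); `most_repeated_names` is a hypothetical helper.

```python
import sqlite3


def most_repeated_names(conn, limit=20):
    """Return (name, count) pairs for the most frequent code_items.name values."""
    query = (
        "SELECT name, COUNT(*) AS n FROM code_items "
        "GROUP BY name ORDER BY n DESC, name LIMIT ?"
    )
    return conn.execute(query, (limit,)).fetchall()
```

It takes an open connection, for example `most_repeated_names(sqlite3.connect(path_to_db))`, and on a table with over a hundred million rows the GROUP BY may run for some time.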

@sean-mcmanus
Collaborator

sean-mcmanus commented Aug 15, 2017

@tojocky Code items are symbols. It looks like your code base has lots of symbols. Do you believe this is expected or does it seem like a bug to you? If non-C/C++ files are being incorrectly parsed due to a file association mapping, that might cause too many symbols to be generated. You could try using files.exclude to remove sections of your code base, which should cause the symbols to be removed. Is the 20 GB database a problem for you? Is performance slow or is it just hogging disk space?
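
To decide which sections to exclude, it may help to see what kinds of files sit under a directory. The sketch below simply counts files by extension; `count_files_by_extension` is a hypothetical helper, and which extensions the extension actually treats as C/C++ is not stated in this thread.

```python
import os
from collections import Counter


def count_files_by_extension(root):
    """Count files under root, grouped by lower-cased extension."""
    counts = Counter()
    for _, _, filenames in os.walk(root):
        for filename in filenames:
            # Extensionless files (such as standard headers like "iomanip") are grouped together.
            ext = os.path.splitext(filename)[1].lower() or "(none)"
            counts[ext] += 1
    return counts
```

Calling `count_files_by_extension(root).most_common(10)` on the workspace folder gives a quick picture of where the bulk of the files is.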

@tojocky

tojocky commented Aug 15, 2017

Hi @sean-mcmanus. I can't complain about performance. Thank you for the great job.
For C++ projects I wanted to use a modern IDE, but I'm fine with vim and sometimes Sublime Text.
This project is really huge.

The only issue is that it hogs disk space.

I will consider using the files.exclude setting.

@tojocky

tojocky commented Aug 16, 2017

BTW, instead of encoding the file name in each code item, wouldn't it be better to move it into a separate table with a primary key? That would avoid repeating a filename thousands of times, and the index also takes a lot of space.

A NoSQL DB would be better.

This is just what I'm thinking.

@lxzh

lxzh commented Apr 28, 2018

1. Press Ctrl+P.
2. Open c_cpp_properties.json.
3. Edit the following node:

"browse": {
    "path": [
        "${workspaceFolder}",
        "D:/Program Files/VS2017/VC/Tools/MSVC/14.11.25503/include/*",
        "D:/Program Files/VS2017/VC/Tools/MSVC/14.11.25503/atlmfc/include/*",
        "C:/Program Files (x86)/Windows Kits/10/Include/10.0.15063.0/um",
        "C:/Program Files (x86)/Windows Kits/10/Include/10.0.15063.0/ucrt",
        "C:/Program Files (x86)/Windows Kits/10/Include/10.0.15063.0/shared",
        "C:/Program Files (x86)/Windows Kits/10/Include/10.0.15063.0/winrt"
    ],
    "limitSymbolsToIncludedHeaders": true,
    "databaseFilename": "D:/Others/VSCode/browse.vc.db"
},

4. Change the "databaseFilename" value to the location where you want to store the browse.vc.db file.
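
With several workspaces to update, the edit in the steps above could be scripted. This is a rough sketch rather than a supported tool: `set_database_filename` is a hypothetical helper, and it assumes the file parses as plain JSON (no comments) with a top-level "configurations" array whose entries hold the "browse" node.

```python
import json


def set_database_filename(properties, db_path):
    """Return a copy of the parsed properties with browse.databaseFilename set on every configuration."""
    updated = json.loads(json.dumps(properties))  # deep copy of plain JSON data
    for config in updated.get("configurations", []):
        # Create the "browse" node if a configuration does not have one yet.
        config.setdefault("browse", {})["databaseFilename"] = db_path
    return updated
```

The caller would load the file with `json.load`, pass the result through this function, and write it back with `json.dump(updated, f, indent=4)`. Per the earlier advice in this thread, the old database file can then be deleted.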

@sean-mcmanus
Collaborator

@ljf1239848066 What's the problem? How big is your file?

@lxzh

lxzh commented May 7, 2018

@sean-mcmanus More than 20G.

@sean-mcmanus
Collaborator

@ljf1239848066 Is your workspace really big? How many files are getting discovered/parsed? If your loggingLevel is high enough, it should show that info in the C/C++ Output window.

@lxzh

lxzh commented May 7, 2018

I'm working on the AOSP project with different branches, so I need to open several instances at the same time, nearly a million files in total. My reply was meant to show a way to move the .db file off the C: drive to avoid running out of space.

@github-actions bot locked and limited conversation to collaborators Oct 19, 2020