We respect our users' privacy and want to clarify that:
- We do NOT store private code on our server.
- We do NOT store any personal information from users such as location, system info, usage stats, coding preferences, etc.
In line with the statements above, we collect two kinds of telemetry data, described below, to improve our Type4Py model and conduct research.
NOTE: If Type4Py's local model is used, no telemetry data is sent to any external server.
- Hashed IP addresses: They help us uniquely identify prediction requests and active users. Note that a hashed IP cannot be decoded, which keeps users anonymous. E.g.
8a0872388f0f1...
- Activation ID: A random unique string that is generated upon the installation of the extension. It helps us to identify users at the same organization while keeping them anonymous. E.g.
31ea3e5...
- Session ID: A random unique string that is generated by our server. It helps to relate prediction requests to the accepted types if shared. E.g.
OTDw5LGgL1BE...
- File hash: The hash of a file's absolute path, which allows us to identify prediction requests for different Python files. E.g.
8dc18307...
- Start and finish times of prediction requests: They help us measure the performance of the Type4Py model and its pipeline for future improvements. E.g.
2021-07-14 15:50:58
- Errors/exceptions that occur on the server side: They help us solve issues related to our pipeline and deliver a better user experience. E.g.
Syntax or parse errors
- Extracted features: This is a JSON object containing type hints that are used for querying the Type4Py model. Note that the JSON object does NOT contain complete source code that can be run or re-used. The extracted features are stored solely for research and improving the model's prediction quality. See a test JSON file here as an example.
- Extension version: The version of the extension a user has installed. It allows us to ignore certain records after/before a certain version. E.g.
0.1.3
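To illustrate how hashed values like the IP hash and file hash above can be produced, here is a minimal sketch. The choice of SHA-256, the helper name `anonymize`, and the sample inputs are assumptions made for illustration; this document does not state which hash function Type4Py actually uses.

```python
import hashlib


def anonymize(value: str) -> str:
    """Return the hex SHA-256 digest of a string (hypothetical helper)."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


# Hypothetical inputs: a documentation-range IP and a made-up file path.
hashed_ip = anonymize("203.0.113.7")
file_hash = anonymize("/home/user/project/app.py")

# The same input always yields the same digest, so requests can be grouped
# without storing the raw IP address or the raw path.
assert hashed_ip == anonymize("203.0.113.7")

# Print truncated digests, similar in shape to the examples in the list.
print(hashed_ip[:13] + "...")
print(file_hash[:8] + "...")
```

Because the digest is deterministic, two requests from the same address or for the same file map to the same value, which is what makes grouping possible without keeping the original string.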
NOTE: We gather the following data if VSCode telemetry is enabled. If not, we explicitly ask users whether they want to share the data below.
- Accepted type: The predicted type that the user accepted from the list of predictions. This helps us improve our Type4Py model's predictions. E.g.
pathlib.Path
- Rank: The rank of the type accepted by the user. E.g.
3
- Type slot: The accepted type belongs to one of these: Parameter, ReturnType, or Variable.
- Identifier names for which a predicted type is accepted. In the future, they might be used to improve the model's predictions based on accepted types. E.g.
path_name
- Identifiers' line number: It can be used to locate identifiers in the JSON object of extracted features. E.g.
17
- Canceled prediction: Whether type predictions for a particular type slot are canceled or rejected by the user. It helps us identify Type4Py's incorrect predictions for research. E.g.
FALSE
- Filtered predictions: Whether the predictions filtering setting is enabled. E.g.
TRUE
- Timestamp: The date and time at which a record was inserted. E.g.
2021-07-14 15:51:04
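For readers who want to picture a single record of this second kind of telemetry, the fields listed above could be grouped roughly as follows. This is only a sketch: the class name, field names, and types are hypothetical and are not taken from the Type4Py codebase. The values reuse the examples given in the list.

```python
from dataclasses import asdict, dataclass
from enum import Enum


class TypeSlot(Enum):
    """The three type slots named in the list above."""
    PARAMETER = "Parameter"
    RETURN_TYPE = "ReturnType"
    VARIABLE = "Variable"


@dataclass
class AcceptedTypeRecord:
    # All field names are hypothetical; they mirror the list items above.
    accepted_type: str          # e.g. "pathlib.Path"
    rank: int                   # rank of the accepted type
    type_slot: TypeSlot         # Parameter, ReturnType, or Variable
    identifier_name: str        # identifier the type was accepted for
    identifier_line: int        # line number of the identifier
    canceled: bool              # whether predictions were canceled or rejected
    filtered_predictions: bool  # whether the filtering setting is enabled
    timestamp: str              # when the record was inserted


record = AcceptedTypeRecord(
    accepted_type="pathlib.Path",
    rank=3,
    type_slot=TypeSlot.PARAMETER,
    identifier_name="path_name",
    identifier_line=17,
    canceled=False,
    filtered_predictions=True,
    timestamp="2021-07-14 15:51:04",
)
print(asdict(record))
```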