TelechatSharp handles the deserialization of JSON exported from Telegram Desktop, making it easy to work with your chat data in .NET applications. It is useful for text extraction and data analysis.
TelechatSharp is available on NuGet.
dotnet add package TelechatSharp
Construct a new Chat object using the file path to your JSON:
Chat chat = new Chat("telegram.json");
Easily extract meaningful data from your chat history:
IEnumerable<Message> messages = chat.Messages;
IEnumerable<Message> messagesFromKyle = messages.FromMember("Kyle Sarte");
IEnumerable<Message> messagesContainingLol = messages.ContainingText("lol");
IEnumerable<string> allEmailsSentInChat = messages.GetAllTextsOfTextEntityType("email");
IEnumerable<string> allPhoneNumbersSentInChat = messages.GetAllTextsOfTextEntityType("phone");
For advanced data analysis use cases, combine TelechatSharp with libraries such as Microsoft.Data.Analysis:
using TelechatSharp.Core;
using Microsoft.Data.Analysis;
Chat chat = new Chat("telegram.json");
// From TelechatSharp.Core, get chat member names and their messages.
IEnumerable<string> from = chat.Messages.Select(m => m.From);
IEnumerable<string> text = chat.Messages.Select(m => m.Text);
// From Microsoft.Data.Analysis, create a new DataFrame using data from TelechatSharp.
DataFrameColumn[] columns = {
    new StringDataFrameColumn("From", from),
    new StringDataFrameColumn("Text", text)
};
DataFrame dataFrame = new(columns);
This code produces a DataFrame similar to the following:
From | Text |
---|---|
Céline | You are gonna miss that plane. |
Jesse | I know. |
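For lighter-weight analysis, plain LINQ may be enough. The following is a minimal sketch that counts messages per member. The senders array is sample data standing in for chat.Messages.Select(m => m.From), and countsByMember is a hypothetical name, not part of TelechatSharp.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Sample data standing in for chat.Messages.Select(m => m.From).
IEnumerable<string> senders = new[] { "Céline", "Jesse", "Céline" };

// Group identical names together and count the messages in each group.
Dictionary<string, int> countsByMember = senders
    .GroupBy(name => name)
    .ToDictionary(g => g.Key, g => g.Count());

foreach (KeyValuePair<string, int> entry in countsByMember)
{
    Console.WriteLine($"{entry.Key}: {entry.Value}");
}
```

With real chat data, the same GroupBy call should work on the sequence of names taken from chat.Messages, provided the From values are non-null.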
A Chat object can also be instantiated using a StreamReader:
public Chat(StreamReader streamReader)
For readability, JSON samples in this README will omit most properties and only include those relevant to the topic being discussed.
While the library is most useful through a Chat object's Messages property, some derived data about a chat can be accessed via extension methods:
var dateCreated = chat.GetDateCreated();
var members = chat.GetMembers();
var originalMembers = chat.GetOriginalMembers();
Data about individual messages can be accessed through properties of the custom Message object:
{
  "messages": [
    {
      "type": "message",
      "date": "2024-01-27T19:40:00",
      "from": "Kyle Sarte",
      "text": "TelechatSharp is public!"
    }
  ]
}
foreach (Message message in chat.Messages)
{
    Console.WriteLine($"On {message.Date}, {message.From} said '{message.Text}'");
}
Output is:
On 1/27/2024 7:40:00 PM, Kyle Sarte said 'TelechatSharp is public!'
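The date format shown above depends on the culture of the machine running the code. The sketch below shows one way to get a culture-independent format. It assumes Message.Date is a DateTime (the README does not state the type), and the date variable is a stand-in for message.Date.

```csharp
using System;
using System.Globalization;

// Stand-in for message.Date, assumed here to be a DateTime.
DateTime date = DateTime.Parse("2024-01-27T19:40:00", CultureInfo.InvariantCulture);

// A custom format string with the invariant culture gives the same text on every machine.
string sortable = date.ToString("yyyy-MM-dd HH:mm:ss", CultureInfo.InvariantCulture);

Console.WriteLine($"On {sortable}, Kyle Sarte said 'TelechatSharp is public!'");
```

Passing an explicit culture or format string is a common choice when output is written to logs or files that other tools will parse.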
In the schema, Telegram messages are either Message types or Service types. A Message is any basic text message. A Service is any action performed on the chat such as the pinning of a message or the invitation of a new member:
{
  "messages": [
    {
      "id": 1,
      "type": "message",
      "text": "Someone please change the group photo."
    },
    {
      "id": 2,
      "type": "service",
      "action": "edit_group_photo"
    }
  ]
}
Messages can be filtered by type through extension methods:
var messageTypeMessages = chat.Messages.GetMessageTypeMessages();
var serviceTypeMessages = chat.Messages.GetServiceTypeMessages();
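To make the distinction concrete, here is a rough sketch that splits the sample JSON above by its "type" property using only System.Text.Json. It illustrates the idea behind the two filters and is not how TelechatSharp implements them; idsByType, messageIds, and serviceIds are hypothetical names.

```csharp
using System;
using System.Linq;
using System.Text.Json;

const string json = @"{
  ""messages"": [
    { ""id"": 1, ""type"": ""message"", ""text"": ""Someone please change the group photo."" },
    { ""id"": 2, ""type"": ""service"", ""action"": ""edit_group_photo"" }
  ]
}";

using JsonDocument document = JsonDocument.Parse(json);

// Index message ids by the value of each message's "type" property.
var idsByType = document.RootElement
    .GetProperty("messages")
    .EnumerateArray()
    .ToLookup(
        m => m.GetProperty("type").GetString(),
        m => m.GetProperty("id").GetInt32());

int[] messageIds = idsByType["message"].ToArray();
int[] serviceIds = idsByType["service"].ToArray();

string messageIdList = string.Join(", ", messageIds);
string serviceIdList = string.Join(", ", serviceIds);
Console.WriteLine($"message ids: {messageIdList}"); // message ids: 1
Console.WriteLine($"service ids: {serviceIdList}"); // service ids: 2
```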
For a full list of available extension methods on Message collections, refer to MessagesExtensions.cs.
Full API documentation coming soon.
Message.Text and Message.TextEntities can be used to work with text data.
Use Message.Text to get a plain text string of a message's text content.
The text property of a message in the JSON is polymorphic: text can either be a plain text string or an array of plain text strings and text entity objects. Message.Text is backed by a private field of custom type Text which handles any necessary string building to abstract these text entities away and simplify text retrieval.
{
  "messages": [
    {
      "type": "message",
      "text": "This is a message with only plain text."
    },
    {
      "type": "message",
      "text": [
        "This is a message with plain text and ",
        {
          "type": "bold",
          "text": "bold text."
        },
        " The bold text appears in the JSON as a text entity. Links, such as ",
        {
          "type": "link",
          "text": "https://www.kylejsarte.com"
        },
        " will also appear as text entities."
      ]
    }
  ]
}
A call to Message.Text for the second message would build and return the following plain text string:
This is a message with plain text and bold text. The bold text appears in the JSON as a text entity. Links, such as https://www.kylejsarte.com will also appear as text entities.
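The string building described above can be sketched with System.Text.Json alone. This is an illustration of the general approach under stated assumptions, not the library's actual Text type; FlattenText is a hypothetical helper that does not exist in TelechatSharp.

```csharp
using System;
using System.Text;
using System.Text.Json;

// Hypothetical helper (not part of TelechatSharp): flattens a polymorphic
// "text" value into a single plain text string.
static string FlattenText(JsonElement text)
{
    // Case 1: the value is already a plain string.
    if (text.ValueKind == JsonValueKind.String)
    {
        return text.GetString() ?? string.Empty;
    }

    // Anything other than a string or an array yields an empty string.
    if (text.ValueKind != JsonValueKind.Array)
    {
        return string.Empty;
    }

    // Case 2: an array mixing plain strings and text entity objects.
    var builder = new StringBuilder();
    foreach (JsonElement part in text.EnumerateArray())
    {
        if (part.ValueKind == JsonValueKind.String)
        {
            builder.Append(part.GetString());
        }
        else if (part.ValueKind == JsonValueKind.Object
                 && part.TryGetProperty("text", out JsonElement inner)
                 && inner.ValueKind == JsonValueKind.String)
        {
            // Keep only the entity's text and drop its type.
            builder.Append(inner.GetString());
        }
    }

    return builder.ToString();
}

const string sample = @"[
  ""This is a message with plain text and "",
  { ""type"": ""bold"", ""text"": ""bold text."" }
]";

using JsonDocument mixed = JsonDocument.Parse(sample);
string flattened = FlattenText(mixed.RootElement);
Console.WriteLine(flattened); // This is a message with plain text and bold text.
```

Handling the string case first keeps the common path cheap; the array case simply concatenates parts in order, which is why surrounding whitespace in the JSON parts matters.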
Use Message.TextEntities for finer-grain access to text data.
Message.TextEntities, unlike Message.Text, preserves the structure of objects returned from the text_entities property of messages in the JSON. Message.TextEntities returns a collection of custom TextEntity types with Type and Text properties:
{
  "messages": [
    {
      "id": 1,
      "type": "message",
      "text_entities": [
        {
          "type": "link",
          "text": "https://www.kylejsarte.com"
        },
        {
          "type": "mention",
          "text": "Kyle Sarte"
        }
      ]
    }
  ]
}
// Get the message with ID 1. First throws if no such message exists.
var message = chat.Messages.First(m => m.Id == 1);
foreach (TextEntity textEntity in message.TextEntities)
{
    Console.WriteLine($"Type: {textEntity.Type}, Text: {textEntity.Text}");
}
Output is:
Type: link, Text: https://www.kylejsarte.com
Type: mention, Text: Kyle Sarte
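Selecting entity texts by type, in the spirit of GetAllTextsOfTextEntityType, can be sketched against the raw JSON as follows. This is an assumption-laden illustration rather than the library's implementation; TextsOfType is a hypothetical helper, and the JSON is the single message object from the sample above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;

// Hypothetical helper (not part of TelechatSharp): returns the "text" of every
// entry in a message's "text_entities" array whose "type" matches entityType.
static List<string> TextsOfType(JsonElement message, string entityType)
{
    // Messages without a "text_entities" array contribute nothing.
    if (!message.TryGetProperty("text_entities", out JsonElement entities)
        || entities.ValueKind != JsonValueKind.Array)
    {
        return new List<string>();
    }

    return entities.EnumerateArray()
        .Where(e => e.GetProperty("type").GetString() == entityType)
        .Select(e => e.GetProperty("text").GetString() ?? string.Empty)
        .ToList();
}

const string messageJson = @"{
  ""id"": 1,
  ""type"": ""message"",
  ""text_entities"": [
    { ""type"": ""link"", ""text"": ""https://www.kylejsarte.com"" },
    { ""type"": ""mention"", ""text"": ""Kyle Sarte"" }
  ]
}";

using JsonDocument messageDocument = JsonDocument.Parse(messageJson);
List<string> links = TextsOfType(messageDocument.RootElement, "link");
Console.WriteLine(string.Join(", ", links)); // https://www.kylejsarte.com
```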