:Path: pimlico.modules.visualization.embeddings_plot
:Executable: yes
Plot vectors from embeddings, trained by some other module, in a 2D space using an MDS or t-SNE reduction and Matplotlib.
They might, for example, come from :mod:`pimlico.modules.embeddings.word2vec`. The embeddings are read in using Pimlico's generic word embedding storage type.

Uses scikit-learn to perform the MDS/t-SNE reduction.
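To illustrate the kind of projection the module performs (this is not the module's actual code, which uses scikit-learn), classical MDS over cosine distances can be sketched with NumPy alone. The embedding matrix here is randomly generated stand-in data:

```python
import numpy as np

def cosine_distances(X):
    # Pairwise cosine distance matrix: 1 - cosine similarity
    unit = X / np.linalg.norm(X, axis=1, keepdims=True)
    return 1.0 - unit @ unit.T

def classical_mds(D, dims=2):
    # Classical MDS: double-centre the squared distance matrix,
    # then project onto the top eigenvectors scaled by sqrt(eigenvalue)
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (D ** 2) @ J
    vals, vecs = np.linalg.eigh(B)
    order = np.argsort(vals)[::-1][:dims]
    return vecs[:, order] * np.sqrt(np.maximum(vals[order], 0.0))

rng = np.random.default_rng(0)
emb = rng.normal(size=(50, 100))   # 50 fake "word vectors", 100 dims
coords = classical_mds(cosine_distances(emb))
print(coords.shape)  # (50, 2)
```

The resulting 2D coordinates are what get scattered onto the Matplotlib plot, one point per word.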
Inputs
======

``vectors``
    :class:`list <pimlico.datatypes.base.MultipleInputs>` of :class:`Embeddings <pimlico.datatypes.embeddings.Embeddings>`
Outputs
=======

``plot``
    :class:`~pimlico.datatypes.plotting.PlotOutput`
Options
=======

``skip`` (int)
    Number of most frequent words to skip, taking the next most frequent after these. Default: 0

``metric`` ('cosine', 'euclidean' or 'manhattan')
    Distance metric to use. Default: 'cosine'

``reduction`` ('mds' or 'tsne')
    Dimensionality reduction technique to use to project to 2D. Available: mds (multi-dimensional scaling), tsne (t-distributed stochastic neighbor embedding). Default: mds

``colors`` (comma-separated list of strings)
    List of colours to use for different embedding sets. Should be a list of Matplotlib colour strings, one for each embedding set given in input_vectors.

``cmap`` (JSON string)
    Mapping from word prefixes to Matplotlib plotting colours. Every word beginning with the given prefix has the prefix removed and is plotted in the corresponding colour. Specify as a JSON dictionary mapping prefix strings to colour strings.

``words`` (int)
    Number of most frequent words to plot. Default: 50
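As a sketch of how these options might be set in a pipeline config file (the section name ``plot_embeddings`` and the input module name ``word2vec_trainer`` are hypothetical, chosen only for illustration):

```ini
[plot_embeddings]
type=pimlico.modules.visualization.embeddings_plot
input_vectors=word2vec_trainer
words=100
skip=10
metric=cosine
reduction=tsne
; Plot words prefixed "N|" in red and "V|" in blue, stripping the prefix
cmap={"N|": "red", "V|": "blue"}
```

Note that ``cmap`` takes a JSON dictionary, while ``colors`` (not shown) would take a plain comma-separated list, one colour per input embedding set.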