In [1]:
import torch

print(torch.__version__)

1.0.0


Tensor is effectively a pointer type: it holds a pointer to a TensorImpl and forwards its methods to it.

Pseudocode:
```
class Tensor {
    protected:
        shared_ptr<TensorImpl> impl_;
    public:
    IntList sizes() const {
        return impl_->sizes();
    }
    int64_t dim() const {
        return impl_->dim();
    }
    IntList strides() const {
        return impl_->strides();
    }
    int64_t storage_offset() const {
        return impl_->storage_offset();
    }
    bool is_contiguous() const {
        return impl_->is_contiguous();
    }
    Type & type() const {
        return legacyTensorType(*impl_);
    }
    const Storage& storage() const {
        return impl_->storage();
    }
    
    template<typename T> T * data() const;
    Tensor& grad() {
        return impl_->grad();
    }
    ...
};
```
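The handle pattern in the pseudocode above can be modeled in a few lines of plain Python. This is only a sketch: `TensorImplSketch` and `TensorHandle` are hypothetical names invented for illustration, not PyTorch classes, and a Python object reference stands in for the `shared_ptr`.

```python
class TensorImplSketch:
    """Hypothetical stand-in for TensorImpl: owns the metadata."""

    def __init__(self, sizes, strides):
        self.sizes = tuple(sizes)
        self.strides = tuple(strides)


class TensorHandle:
    """Hypothetical stand-in for Tensor: holds only a reference to an impl."""

    def __init__(self, impl):
        self._impl = impl  # plays the role of shared_ptr<TensorImpl> impl_

    # Every accessor simply forwards to the impl, as in the pseudocode.
    def sizes(self):
        return self._impl.sizes

    def strides(self):
        return self._impl.strides

    def dim(self):
        return len(self._impl.sizes)


impl = TensorImplSketch(sizes=(2, 3, 4), strides=(12, 4, 1))
a = TensorHandle(impl)
b = TensorHandle(a._impl)  # copying a handle copies the reference, not the data

shares_impl = a._impl is b._impl
print(a.dim(), a.sizes(), shares_impl)
```

Under this model, copying a handle is cheap because both handles refer to the same impl object.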

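The Tensor source lives under aten/src/ATen.

Next, let's look at how the TensorImpl class is defined.

TensorImpl holds the actual data object, Storage, plus the metadata that describes how to read it (sizes_, strides_, storage_offset_, numel_, and so on).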

```
struct TensorImpl {
    public:
        Storage storage_;  // the data object
    protected:
        at::SmallVector<int64_t,5> sizes_;
        at::SmallVector<int64_t,5> strides_;
        int64_t storage_offset_ = 0;
        int64_t numel_ = 1;
        caffe2::TypeMeta data_type_;
        bool is_contiguous_ = true;  // returned by is_contiguous() below

    public:
        virtual IntList sizes() const;
        virtual IntList strides() const;
        virtual int64_t dim() const;
        virtual const Storage& storage() const; 
        virtual int64_t storage_offset() const { 
            return storage_offset_;
        } 
        virtual int64_t numel() const {
            return numel_;
        }
        virtual bool is_contiguous() const {
            return is_contiguous_;
        }
        inline void* data() const {
            return static_cast<void*>(
                static_cast<char*>(storage_.data()) +
                data_type_.itemsize() * storage_offset_);
        }
        size_t itemsize() const {
            AT_ASSERT(dtype_initialized());
            return data_type_.itemsize();
        }
        …
};
```
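The pointer arithmetic in `data()` above (base address plus `itemsize * storage_offset`) can be tried out with the standard-library `array` module. This is a sketch under assumptions: the `array` object merely stands in for a Storage, and `data_address` is a hypothetical helper, not a PyTorch function.

```python
from array import array

# A flat buffer of 24 C floats stands in for the Storage.
storage = array("f", range(24))
base_address, _length = storage.buffer_info()
itemsize = storage.itemsize  # size in bytes of one element


def data_address(base_address, itemsize, storage_offset):
    """Mirror of the pseudocode: base pointer + itemsize * storage_offset."""
    return base_address + itemsize * storage_offset


storage_offset = 3
address = data_address(base_address, itemsize, storage_offset)
byte_offset = address - base_address

# The tensor's first element is the storage element at storage_offset.
first_element = storage[storage_offset]
print(byte_offset, first_element)
```

The offset is counted in elements, so it has to be scaled by the element size before it is added to the byte address.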

Next comes the definition of Storage. Storage is also effectively a pointer type: it points to the underlying data class, StorageImpl.

```
struct Storage {
    protected:
        shared_ptr<StorageImpl> storage_impl_; 
    public:
        template <typename T> T* data() const {
            return storage_impl_->data<T>(); 
        }
        ptrdiff_t size() const {    
            return storage_impl_->numel();
        }
        int64_t numel() const {
            return storage_impl_->numel();
        }
        void* data() const {
            return storage_impl_.get()->data();
        }
        …
};
```

The StorageImpl class definition:
```
struct StorageImpl {
    private:
        caffe2::TypeMeta data_type_; 
        DataPtr data_ptr_;       // pointer to the data
        int64_t numel_; 
        bool resizable_; 
        Allocator* allocator_;  // memory allocator
    public:     
    template <typename T> inline T* data() const {
        auto data_type_T = at::scalarTypeToDataType(c10::CTypeToScalarType<T>::to()); 
        return unsafe_data<T>(); 
    } 
    int64_t numel() const { 
        return numel_; 
    }; 
    at::DataPtr& data_ptr() { 
        return data_ptr_; 
    }; 
    void* data() const { 
        return data_ptr_.get(); 
    } 
    const at::Allocator* allocator() const { 
        return allocator_; 
    }
    ...
};
```
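To make the division of labor concrete, here is a rough Python model of a storage that several views share. `StorageSketch` and the two dictionary "views" are hypothetical illustrations, not PyTorch types; the point is only that the buffer exists once and each view adds its own offset.

```python
from array import array


class StorageSketch:
    """Hypothetical stand-in for StorageImpl: one flat, typed buffer."""

    def __init__(self, typecode, numel):
        self.buffer = array(typecode, [0] * numel)  # plays the role of data_ptr_
        self.numel = numel                          # plays the role of numel_

    def nbytes(self):
        return self.numel * self.buffer.itemsize


storage = StorageSketch("f", 24)

# Two "views" over the same storage that differ only in their offset.
view_a = {"storage": storage, "offset": 0}
view_b = {"storage": storage, "offset": 12}

# Writing through view_b's first element changes the shared buffer...
view_b["storage"].buffer[view_b["offset"]] = 7.0

# ...so the change is visible at position 12 when read through view_a.
seen_from_a = view_a["storage"].buffer[12]
print(storage.numel, seen_from_a)
```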

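As we can see, no matter how many dimensions a Tensor has, its data is ultimately a single one-dimensional storage buffer (though the elements a given Tensor uses need not be contiguous within that buffer).

So how does a Tensor manage this buffer and look up elements? How does data[0][1][2] find its value?

Let's keep going.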

In [2]:
data = torch.arange(24, dtype=torch.float32).view(2, 3, 4)

In [3]:
data

tensor([[[ 0.,  1.,  2.,  3.],
         [ 4.,  5.,  6.,  7.],
         [ 8.,  9., 10., 11.]],

        [[12., 13., 14., 15.],
         [16., 17., 18., 19.],
         [20., 21., 22., 23.]]])

In [4]:
# number of dimensions
print(data.ndimension())
data.dim()

3


3

In [5]:
# size
data.size()

torch.Size([2, 3, 4])

In [11]:
# strides

data.stride()

(12, 4, 1)
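For a contiguous, row-major tensor the strides follow directly from the sizes: the last dimension has stride 1, and each earlier stride is the product of all later sizes. A minimal sketch in plain Python (`contiguous_strides` is a hypothetical helper, not a PyTorch function):

```python
def contiguous_strides(sizes):
    """Row-major strides, in elements, for a tensor of the given sizes."""
    strides = [1] * len(sizes)
    # Walk from the second-to-last dimension back to the first.
    for d in range(len(sizes) - 2, -1, -1):
        strides[d] = strides[d + 1] * sizes[d + 1]
    return tuple(strides)


strides = contiguous_strides((2, 3, 4))
print(strides)  # (12, 4, 1), matching data.stride() above
```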

![](https://github.com/basicv8vc/pytorch-tutorials/blob/master/imgs/storage.png?raw=true)

Indexing into the data

![](https://github.com/basicv8vc/pytorch-tutorials/blob/master/imgs/how_to_index.png?raw=true)

In [19]:
# index into the Tensor
data[1][2][2] 

tensor(22.)

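Why is data[1][2][2] equal to 22?


Because the underlying data is a one-dimensional array, the first step in any lookup is to work out the position in that array. How do we compute that position from the i, j, k in data[i][j][k]?

data.stride() gives, for each dimension, the number of storage elements to skip when the index along that dimension increases by 1. For a contiguous tensor this equals the number of elements in one slice along that dimension: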

In [17]:
data[0], data[0].numel(), data.stride(0)

(tensor([[ 0.,  1.,  2.,  3.],
         [ 4.,  5.,  6.,  7.],
         [ 8.,  9., 10., 11.]]), 12, 12)

In [18]:
data[0][0], data[0][0].numel(), data.stride(1)

(tensor([0., 1., 2., 3.]), 4, 4)

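With data.stride() we can compute the position. For a tensor whose storage offset is 0, data[i][j][k] lives at position i*stride(0) + j*stride(1) + k*stride(2); here that is 1*12 + 2*4 + 2*1 = 22.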

![](https://github.com/basicv8vc/pytorch-tutorials/blob/master/imgs/compute_index.png?raw=true)
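The position calculation can be written as a short function. This is a plain-Python sketch: `flat_index` is a hypothetical helper, and the list `storage` stands in for the tensor's one-dimensional storage.

```python
def flat_index(indices, strides, storage_offset=0):
    """Position in the 1-D storage of the element at the given indices."""
    if len(indices) != len(strides):
        raise ValueError("need exactly one index per dimension")
    return storage_offset + sum(i * s for i, s in zip(indices, strides))


# Stand-in for the storage behind torch.arange(24, dtype=torch.float32).
storage = [float(v) for v in range(24)]

position = flat_index((1, 2, 2), (12, 4, 1))
print(position, storage[position])  # 22 22.0
```

Because the storage here holds 0 through 23 in order, the position and the value coincide, which is why data[1][2][2] prints 22.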


In most cases a Tensor's elements occupy one contiguous range of memory, but there are exceptions:


In [21]:
x = torch.Tensor().set_(data.storage(), storage_offset=3, size=torch.Size([2,3]), stride=(5,2))
x

tensor([[ 3.,  5.,  7.],
        [ 8., 10., 12.]])
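The result above can be reproduced by hand from the offset, size, and stride alone. The sketch below uses plain Python; `strided_view_2d` and `is_contiguous` are hypothetical helpers written for illustration, and the contiguity check is a simplified version of the idea, not PyTorch's implementation.

```python
def strided_view_2d(storage, storage_offset, size, stride):
    """Read a 2-D view out of a flat storage using offset and strides."""
    rows, cols = size
    return [
        [storage[storage_offset + i * stride[0] + j * stride[1]]
         for j in range(cols)]
        for i in range(rows)
    ]


def is_contiguous(size, stride):
    """True if the strides match a dense row-major layout for this size."""
    expected = 1
    for n, s in zip(reversed(size), reversed(stride)):
        if n != 1 and s != expected:
            return False
        expected *= n
    return True


storage = [float(v) for v in range(24)]
x = strided_view_2d(storage, storage_offset=3, size=(2, 3), stride=(5, 2))
print(x)  # [[3.0, 5.0, 7.0], [8.0, 10.0, 12.0]]
print(is_contiguous((2, 3), (5, 2)))          # False
print(is_contiguous((2, 3, 4), (12, 4, 1)))   # True
```

The view reads storage positions 3, 5, 7 and 8, 10, 12, so it skips elements in between and is not contiguous, even though the storage itself is one flat buffer.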