Skip to content

SparseArray

cheyiliu edited this page Jan 22, 2015 · 2 revisions

android SparseArray 源码分析



/*
 * Copyright (C) 2006 The Android Open Source Project
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *      http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

package android.util;

import com.android.internal.util.ArrayUtils;
import com.android.internal.util.GrowingArrayUtils;

import libcore.util.EmptyArray;

/**
 * SparseArrays map integers to Objects.  Unlike a normal array of Objects,
 * there can be gaps in the indices.  It is intended to be more memory efficient
 * than using a HashMap to map Integers to Objects, both because it avoids
 * auto-boxing keys and its data structure doesn't rely on an extra entry object
 * for each mapping.
 * 
 * SparseArray用于映射integers到object。但不像普通数组那样,sparseArray的元素间没有无用元素。
 * 在映射integers到object的过程中,SparseArray由于采用避免自动装箱的keys和它的数据结构不依赖额外
 * 的对象来存储映射关系的实现,因此它比hashMap的内存使用更高效一些。
 * 
 * 
 * <p>Note that this container keeps its mappings in an array data structure,
 * using a binary search to find keys.  The implementation is not intended to be appropriate for
 * data structures
 * that may contain large numbers of items.  It is generally slower than a traditional
 * HashMap, since lookups require a binary search and adds and removes require inserting
 * and deleting entries in the array.  For containers holding up to hundreds of items,
 * the performance difference is not significant, less than 50%.</p>
 * 
 * 注意:SparseArray在查找keys的过程中采用了二分查找, 这种实现不适合数据量大的情况。由于查找时要用到二分查找,
 * 添加删除时涉及到数组其他元素的挪动,因此通常SparseArray会比hashMap慢。当处理上百的数据量,这种性能差异不是特别
 * 明显,性能差异不超过50%。
 * 
 *
 * <p>To help with performance, the container includes an optimization when removing
 * keys: instead of compacting its array immediately, it leaves the removed entry marked
 * as deleted.  The entry can then be re-used for the same key, or compacted later in
 * a single garbage collection step of all removed entries.  This garbage collection will
 * need to be performed at any time the array needs to be grown or the the map size or
 * entry values are retrieved.</p>
 * 
 * 为了优化性能,SparseArray针对remove case作了优化,remove时它不是立即挤压数组空间,而是标记为delete。
 * 这个被标记的元素要么被重复利用,要么在多次remove之后通过一次gc操作中被挤压出去。
 * gc需要在下列情况之前被执行:数组要扩容;get map size;get values;
 * 
 *
 * <p>It is possible to iterate over the items in this container using
 * {@link #keyAt(int)} and {@link #valueAt(int)}. Iterating over the keys using
 * <code>keyAt(int)</code> with ascending values of the index will return the
 * keys in ascending order, or the values corresponding to the keys in ascending
 * order in the case of <code>valueAt(int)</code>.</p>
 * 
 * 可以用keyAt valueAt实现遍历
 */
public class SparseArray<E> implements Cloneable {
    //巧妙的flag
    //当有元素被remove delete时,先暂时设置该对象为DELETED,并置mGarbage=true
    //用以提升性能,体现在哪里?问题1
    private static final Object DELETED = new Object();
    private boolean mGarbage = false;

    //存储的数据结构,两个数组加一个size
    //特殊在哪里?或者怎么用的呢?问题2
    private int[] mKeys;
    private Object[] mValues;
    private int mSize;

    /**
     * Creates a new SparseArray containing no mappings.
     */
    public SparseArray() {
        this(10);//默认容量是10个元素
    }

    /**
     * Creates a new SparseArray containing no mappings that will not
     * require any additional memory allocation to store the specified
     * number of mappings.  If you supply an initial capacity of 0, the
     * sparse array will be initialized with a light-weight representation
     * not requiring any additional array allocations.
     */
    public SparseArray(int initialCapacity) {
        if (initialCapacity == 0) {
            //看EmptyArray的实现便知,mKeys的初值等于new int[0], 其他同理
            mKeys = EmptyArray.INT;
            mValues = EmptyArray.OBJECT;
        } else {
            //newUnpaddedObjectArray有啥特性?问题3
            mValues = ArrayUtils.newUnpaddedObjectArray(initialCapacity);
            mKeys = new int[mValues.length];
        }
        mSize = 0;
    }

    @Override
    @SuppressWarnings("unchecked")
    public SparseArray<E> clone() {
        SparseArray<E> clone = null;
        try {
            //java深拷贝
            clone = (SparseArray<E>) super.clone();
            clone.mKeys = mKeys.clone();
            clone.mValues = mValues.clone();
        } catch (CloneNotSupportedException cnse) {
            /* ignore */
        }
        return clone;
    }

    /**
     * Gets the Object mapped from the specified key, or <code>null</code>
     * if no such mapping has been made.
     */
    public E get(int key) {
        return get(key, null);
    }

    /**
     * Gets the Object mapped from the specified key, or the specified Object
     * if no such mapping has been made.
     */
    @SuppressWarnings("unchecked")
    public E get(int key, E valueIfKeyNotFound) {
        //二分查找, 返回值含义是什么?问题4
        int i = ContainerHelpers.binarySearch(mKeys, mSize, key);

        if (i < 0 || mValues[i] == DELETED) {
            // 没找到对应的key,或者找到了Key,但该元素被标记为delete
            // 则返回默认值
            return valueIfKeyNotFound;
        } else {
            // i>0 且该位置的元素未被标记为待删除,返回该值
            return (E) mValues[i];
        }
    }

    /**
     * Removes the mapping from the specified key, if there was any.
     */
    public void delete(int key) {
        //二分查找
        int i = ContainerHelpers.binarySearch(mKeys, mSize, key);

        //若找到了
        if (i >= 0) {
            //若之前未被标记delete
            if (mValues[i] != DELETED) {
                //标记为delete,garbage=true
                mValues[i] = DELETED;
                mGarbage = true;
            }
        }
    }

    /**
     * Alias for {@link #delete(int)}.
     */
    public void remove(int key) {
        delete(key);
    }

    /**
     * Removes the mapping at the specified index.
     */
    public void removeAt(int index) {
        //移除特定位置的元素,注意传入的是mValues的index不是Key
        if (mValues[index] != DELETED) {
            mValues[index] = DELETED;
            mGarbage = true;
        }
    }

    /**
     * Remove a range of mappings as a batch.
     *
     * @param index Index to begin at
     * @param size Number of mappings to remove
     */
    public void removeAtRange(int index, int size) {
        //确定结束位置
        final int end = Math.min(mSize, index + size);
        //从起点开始循环 remove
        for (int i = index; i < end; i++) {
            removeAt(i);
        }
    }

    private void gc() {
        // Log.e("SparseArray", "gc start with " + mSize);

        //'挤'空间了, 提供空间利用率
        int n = mSize;
        int o = 0;
        int[] keys = mKeys;
        Object[] values = mValues;

        //循环整个元素区间
        /*
         * 模拟一遍就明白了
         *i    = 0 1 2 3 4 5  6 
         *key  = 1 2 6 8 9 10 11
         *value= x y z D u D  w
         *value D代表delete的对象
         *
         *i=0时,val==x, val != DELETED,i==o, o++, o == 1
         *i=1时,val==y, val != DELETED,i==o, o++, o == 2
         *i=2时,val==z, val != DELETED,i==o, o++, o == 3
         *i=3时,val==D, val == DELETED,o == 3
         *i=4时,val==u, val != DELETED,i!=o, keys[3] = keys[4], values[3] = values[4], values[4] = null, o++, o == 4
         *   (i=4之后,key= 1 2 6 9 9 10 11, value= x y z u null D w)
         *i=5时,val==D, val == DELETED,o == 4
         *   (i=5之后,key= 1 2 6 9 9 10 11, value= x y z u null D w)较上一步没变化
         *i=6时,val==w, val != DELETED,i!=o, keys[4] = keys[6], values[4] = values[6], values[6] = null, o++, o == 5
         *   (i=6之后,key= 1 2 6 9 11 10 11, value= x y z u w D null)
         *循环结束   
         *mSize = 5
         */
        for (int i = 0; i < n; i++) {
            Object val = values[i];

            if (val != DELETED) {
                if (i != o) {
                    keys[o] = keys[i];
                    values[o] = val;
                    values[i] = null;
                }

                o++;
            }
        }

        mGarbage = false;
        mSize = o;

        // Log.e("SparseArray", "gc end with " + mSize);
    }

    /**
     * Adds a mapping from the specified key to the specified value,
     * replacing the previous mapping from the specified key if there
     * was one.
     */
    public void put(int key, E value) {
        //先二分查找,确定插入位置,保证了key数组的有序性
        int i = ContainerHelpers.binarySearch(mKeys, mSize, key);

        if (i >= 0) {
            //找到了,直接替换
            mValues[i] = value;
        } else {
            //一点小技巧,跟二分查找的返回值有关
            //没找到的情况下i的意义是 i = -insertPoint -1,比如i=-2,则insertPoint=1
            //而-2在内存中存的是补码,对他取反刚好得insertPoint,跟上面计算结果一样,但位操作更高效
            i = ~i;

            //若i在size范围内,且刚好对应位置标记为delete了,直接放入
            if (i < mSize && mValues[i] == DELETED) {
                mKeys[i] = key;
                mValues[i] = value;
                return;
            }

            //若前面if不成立,即i超出了size范围,或者对应的位置的元素是有效的
            if (mGarbage && mSize >= mKeys.length) {
                //压缩空间
                gc();

                //重新查找插入点
                // Search again because indices may have changed.
                i = ~ContainerHelpers.binarySearch(mKeys, mSize, key);
            }
            
            //插入,若空间不够则会重新分配数组
            mKeys = GrowingArrayUtils.insert(mKeys, mSize, i, key);
            mValues = GrowingArrayUtils.insert(mValues, mSize, i, value);
            mSize++;
        }
    }

    /**
     * Returns the number of key-value mappings that this SparseArray
     * currently stores.
     */
    public int size() {
        if (mGarbage) {
            gc();
        }

        return mSize;
    }

    /**
     * Given an index in the range <code>0...size()-1</code>, returns
     * the key from the <code>index</code>th key-value mapping that this
     * SparseArray stores.
     *
     * <p>The keys corresponding to indices in ascending order are guaranteed to
     * be in ascending order, e.g., <code>keyAt(0)</code> will return the
     * smallest key and <code>keyAt(size()-1)</code> will return the largest
     * key.</p>
     */
    public int keyAt(int index) {
        if (mGarbage) {
            gc();
        }

        return mKeys[index];
    }

    /**
     * Given an index in the range <code>0...size()-1</code>, returns
     * the value from the <code>index</code>th key-value mapping that this
     * SparseArray stores.
     *
     * <p>The values corresponding to indices in ascending order are guaranteed
     * to be associated with keys in ascending order, e.g.,
     * <code>valueAt(0)</code> will return the value associated with the
     * smallest key and <code>valueAt(size()-1)</code> will return the value
     * associated with the largest key.</p>
     */
    @SuppressWarnings("unchecked")
    public E valueAt(int index) {
        if (mGarbage) {
            gc();
        }

        return (E) mValues[index];
    }

    /**
     * Given an index in the range <code>0...size()-1</code>, sets a new
     * value for the <code>index</code>th key-value mapping that this
     * SparseArray stores.
     */
    public void setValueAt(int index, E value) {
        if (mGarbage) {
            gc();
        }

        mValues[index] = value;
    }

    /**
     * Returns the index for which {@link #keyAt} would return the
     * specified key, or a negative number if the specified
     * key is not mapped.
     */
    public int indexOfKey(int key) {
        if (mGarbage) {
            gc();
        }

        return ContainerHelpers.binarySearch(mKeys, mSize, key);
    }

    /**
     * Returns an index for which {@link #valueAt} would return the
     * specified key, or a negative number if no keys map to the
     * specified value.
     * <p>Beware that this is a linear search, unlike lookups by key,
     * and that multiple keys can map to the same value and this will
     * find only one of them.
     * <p>Note also that unlike most collections' {@code indexOf} methods,
     * this method compares values using {@code ==} rather than {@code equals}.
     */
    public int indexOfValue(E value) {
        if (mGarbage) {
            gc();
        }

        for (int i = 0; i < mSize; i++)
            if (mValues[i] == value)
                return i;

        return -1;
    }

    /**
     * Removes all key-value mappings from this SparseArray.
     */
    public void clear() {
        int n = mSize;
        Object[] values = mValues;

        for (int i = 0; i < n; i++) {
            //值空,利于jvm gc
            values[i] = null;
        }

        mSize = 0;
        mGarbage = false;
    }

    /**
     * Puts a key/value pair into the array, optimizing for the case where
     * the key is greater than all existing keys in the array.
     */
    public void append(int key, E value) {
        //若key小于等于已有的最大key,直接Put
        if (mSize != 0 && key <= mKeys[mSize - 1]) {
            put(key, value);
            return;
        }

        if (mGarbage && mSize >= mKeys.length) {
            gc();
        }

        //若key大于了现有的所有key,就不用走put的二分查找过程了,直接append
        //即comments说的优化
        mKeys = GrowingArrayUtils.append(mKeys, mSize, key);
        mValues = GrowingArrayUtils.append(mValues, mSize, value);
        mSize++;
    }

    /**
     * {@inheritDoc}
     *
     * <p>This implementation composes a string by iterating over its mappings. If
     * this map contains itself as a value, the string "(this Map)"
     * will appear in its place.
     */
    @Override
    public String toString() {
        if (size() <= 0) {
            return "{}";
        }

        StringBuilder buffer = new StringBuilder(mSize * 28);
        buffer.append('{');
        for (int i=0; i<mSize; i++) {
            if (i > 0) {
                buffer.append(", ");
            }
            int key = keyAt(i);
            buffer.append(key);
            buffer.append('=');
            Object value = valueAt(i);
            if (value != this) {
                buffer.append(value);
            } else {
                buffer.append("(this Map)");
            }
        }
        buffer.append('}');
        return buffer.toString();
    }
}

问题1,mGarbage标记结合DELETED,用以提升性能,体现在哪里? 
    将可能的多次gc操作变为一次完成。比如用户调用了多次removeAt操作,若不用这种策略,则会每次remove都对应一次gc即用一次循环来'挤压'空间。
    采用这种flag优化后,多次remove仅多次设置了标志,在gc触发时,仅需要一次循环就可以将空间'挤压'好。
    
问题2,sparseArray的存储结构,两个数组加一个size有的特性?
    特性一,key数组是有序的,key在数组的index和vale在数组的index一样
    特性二,sparseArray每次gc过程,保证了他的有效值(size)区间内没有无效值,或者说'无缝隙',有效值是连续的
    特性三,如文档所说,当map Integers to Objects时,相对hashMap,sparseArray被设计成更加的内存高效,
           sparseArray避免了自动装箱机制和用额外的对象来存储每个映射。
    
问题3,ArrayUtils.newUnpaddedObjectArray(initialCapacity);有什么特性?
    最终调用的是VMRuntime.getRuntime().newUnpaddedArray(Object.class, minLen);
    参考 http://androidxref.com/5.0.0_r2/xref/libcore/libart/src/main/java/dalvik/system/VMRuntime.java#260

问题4, 二分查找的返回值含义,若值存在则返回该值的Index,否则返回和该值的插入点相关的一个值(-(insertion point) - 1)
   (the index of the search key, if it is contained in the list; otherwise, (-(insertion point) - 1).) 

小结

  • SparseArray的特性和一些内部实现见上面代码中的comments
  • SparseArray get put都用了二分查找,时间复杂的是O(log n), vs HashMap的get put的时间负责度是O(1)
  • SparseArray remove delete put时都有空间挪动的开销
  • SparseArray相对于hashMap优化在哪里?避免了AutoBox和不用额外的空间来存映射关系,故它的唯一有点就是针对小数据量时的空间效率更高,尤其适合手机这种设备
  • 在开发android时,若写这句话HashMap<Integer, String> hashMap = new HashMap<Integer, String>();时,IDE会警告Use new SparseArray<String>(...) instead for better performance你得视你的数据量情况而定了

参考链接

Clone this wiki locally